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ABSTRACT 

Current and upcoming wide-field, ground-based, broad-band imaging surveys promise 
to address a wide range of outstanding problems in galaxy formation and cosmology. 
Several such uses of ground-based data, especially weak gravitational lensing, require 
highly precise measurements of galaxy image statistics with careful correction for 
the effects of the point-spread function (PSF). In this paper, we introduce the SHERA 
(SHEar Reconvolution Analysis) software to simulate ground-based imaging data with 
realistic galaxy morphologies and observing conditions, starting from space-based data 
(from COSMOS, the Cosmological Evolution Survey) and accounting for the effects of 
the space-based PSF. This code simulates ground-based data, optionally with a weak 
lensing shear applied, in a model-independent way using a general Fourier space for- 
malism. The utility of this pipeline is that it allows for a precise, realistic assessment of 
systematic errors due to the method of data processing, for example in extracting weak 
lensing galaxy shape measurements or galaxy radial profiles, given user-supplied obser- 
vational conditions and real galaxy morphologies. Moreover, the simulations allow for 
the empirical test of error estimates and determination of parameter degeneracies, via 
generation of many noise maps. The public release of this software, along with a large 
sample of cleaned COSMOS galaxy images (corrected for charge transfer inefficiency), 
should enable upcoming ground-based imaging surveys to achieve their potential in 
the areas of precision weak lensing analysis, galaxy profile measurement, and other 
applications involving detailed image analysis. 

Key words: methods: data analysis - techniques: image processing - gravitational 
lensing: weak - galaxies: structure 



1 INTRODUCTION 

A tremendous variety of measurements are carried out on as- 
tronomical images from ground-based telescopes. A generic 
problem that often arises is the question of how the intrin- 
sically limited resolution of ground-based images (both due 
to convolution with the atmospheric point-spread function, 
or PSF, and due to the finite pixel size) affects our ability 
to measure quantities such as the radial profiles of galaxies 
/(r), or statistics of the profile such as its slope, half-light 
radius, and ellipticity. Moreover the error distributions of 
these parameters, which are often estimated via a highly 
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non-linear fitting procedure, are also unclear in many cir- 
cumstances. 

One application that particularly suffers from such 
uncertainties is the field of weak gravitational lensin, 



(for a review, see Bartelmann fc Schneiden 2001 
l2003al . iHoekstra fc JainI |2008| . or iMassev et al, " 



Refregiei 



2010al ). In 



the past decade, weak lensing has been used exten- 
sively for measurements that can constrain cosmology 
and galaxy formation. Cosmic shear measurements have 
constrained cosmological paranieters (e.g., most recently, 
IFu et al.l l2008: Schrabback ot al. 2 0101): cluster lensing anal- 
yses (e.g., Hockstra 2007 : Okabe et al.l 120101 ) have been 
used to understand the most massive structures in the 
universe, the abundance of which can constrain cosmol- 
ogy through the mass function (e.g., iRines et al.l 120071 : 
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iMantz et al1l2008l : IVikhlinm et"ai]|2009l : IRozo et alJboid ): 
and galaxy-galaxy lensing measurements have probed the 
connection between galaxies, their d ark matter halos , 
and their larger scale environments ( Schul z et alj l20ld : 
iLeauthaud et alj 120111 ). as well as constraining the the- 
ory of gravity on cosmologica l scales when com bined with 
other observational methods (JReves et al.ll201Cl ). The next 
decade promises a larger volume of weak lensing data that 
can be used for more precise constraints on cosmology and 
galaxy form ation, from surveys s uch as Hyper Suprime- 
Cam f HSC. iMivazaki et all I2OO6I'). Dark Energy Survey 
(DE^. iDark Energy Survey CollaborationI l2005l ) . the Kilo- 
Degree Survey (KID^, the Panoramic Surv ey Telescope 
and Rapid Response System (Pan-STARRqf|, iKaiser et alJ 
120101 ): and even more ambitious programmes are planned 
such as the Large Synoptic Su rvey Telescope (LSSTij, 
ILSST Science CoU aborationd l2009l '). Euclitd, and the Wide- 
Field Infrared Survey Telescope (WFIRSlfl). 

Weak lensing measurements depend on precise measure- 
ments of the shapes of galaxies, in an attempt to infer the 
apparent shape distortions in distant 'source' galaxies due 
to the mass in nearby 'lenses' (galaxies, clusters, or other 
mass distributions). Weak lensing is a statistical measure- 
ment, with averages over large numbers of sources in order 
to detect the ~ 0.1-1 per cent level distortions within the 
noise of the intrinsic galaxy ellipticities, which are typically 
a factor of ~ 100 larger. Unfortunately, coherent systematic 
distortions of galaxy shapes due to the PSF caused by the 
atmosphere (for a ground-based telescope), telescope optics, 
and detector are significantly larger than typical weak lens- 
ing distortions, which means that accurate PSF-correction 
is critical for current lensing studies, and all the more so for 
future lensing surveys that aim for < 1 per cent statistical 
errors. 

The weak lensing community has had several chal- 
lenges, using blind simulations, to identify the most promis- 
ing methods of PSF correction. The first of these , the S hear 
TEsting Programme-1 (STEPl: iHeymans et allbOOel '). in- 
cluded mock galaxies with idealized radial profiles and sev- 
eral PSFs meant to mimic specific observational issues (e.g., 
astigmatism). The second STEP2 jMassey et al. 2007a) , 
used shapelets (JRefregied l2003bl : iRefregier fc BaconI |2003| ) 
decompositions of COSMOS galaxies (including the COS- 
MOS PSF, for which no correction was imposed) as in- 
puts, and then convolved them with a var iety of ground- 
based PSFs from Subaru Suprime-Cam (|Mivazaki et al.l 
120021 ') . Finally, t he GRavitational lEnsing A ccuracy Testing- 
08 (GREAT08: [Bridle et al.l |2009| , |2010D and GREATIO 
l|Kitching et al.l l20ld ) challenges rever ted to com posite 
model galaxies (based on Sersic profiles; ISersid 119681 ) with 
very specific sets of parameters, and tested the recovery of 
the shear as a function of image S/N, PSF size, and galaxy 
profile type. All of these challenges were useful to the lens- 
ing community in difi'erent ways, and in some cases led to 
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changes in attitudes towards (or a greater understanding of) 
common methods of PSF correction. However, their ability 
to identify, in a broad sense, the most promising methods 
of PSF correction does not mean that they can be used 
to constrain, to high precision, the shear calibration in all 
lensing observations using those PSF-correction methodqj. 
There are numerous reasons why this is the case, such as 
galaxy models and observing conditions (PSF and depth) 
that are not r epresentative of a particular survey. STEP2 
(JMassev et al.ll2007a ) showed that the shear systematics for 
nearly every method of PSF correction depend on the ob- 
serving conditions (the galaxy S/N and resolution compared 
to the PSF); GREAT08 (Bridle et al. 2010) showed a depen- 
dence on galaxy type as well. 

More recent work has shown that the dependence of 
shear calibrat ion factors on t he gala xy p opulation is generic. 
In particular. iMassev et al.l (|2007bl ) and lZhang fc Komatsul 
(2011) showed that there is no stable shape measurement 
algorithm on finite-resolution data whose shear calibration 
factor is i ndepe ndent of the galaxy populatiq rQ. Instead, 
iBernsteinI l|2010l ) and IZhang fc KomatsrJ l|201ll ) argue that 
the same lensing survey that measures shear could also de- 
termine the range of galaxy models presented to us by the 
real Universe. While we will explore this point in more detail 
in Sec. O we conclude that to precisely constrain the shear 
calibration or understand observational selection biases in 
any given survey, the simulations must have realistic galaxy 
models as well as observing conditions. 

When constraining systematic errors in ground-based 
lensing data, we may wish to use space-based data as the 
basis of our simulations, due to its much higher resolu- 
tion. Indeed, one might ask why simulations are needed 
at all: can we simply rely on comparison with PSF- 
corrected shapes on space-base d images? This approach was 
taken bv lKasliwal et al.l ([2003), who co mpared s hape mea- 
surements using a K SB-based method (IKaiser et al. 1995|; 
iHamana et al.l 120031 ) on Subaru d ata against shape m ea- 
surements using the RRG method ({Rhodes et al.ll2000l ) on 
COSMOS data. However, there are some limitations to this 
approach. First, if one wishes to avoid complications due 
to a non-negligible PSF and therefore the need for substan- 
tial PSF correction in the space-based data, the compari- 
son must be restricte d to a subset of larger galaxies, as in 
iKasliwal et al.l (|2008h . More importantly, such a direct com- 
parison of the derived shapes or shears is not possible at all 



^ The validity of this statement depends on the data for which the 
shear calibration is desired. For example, the STEP2 simulations 
are likely to give a more realistic estimate of the calibration for 
the Subaru Suprime-Cam data that it was designed to mimic 
than for data from some other telescope (assuming that image 
combination and other steps in a realistic data analysis, which 
were not simulated, do not introduce additional biases). 
* The argument hinges on the existence of a finite number of 
well-measured moments Mij of the galaxy, and the fact that the 
dependence of the Mij on shear {dMij /d-fk) is determined in part 
by higher, unmeasured moments. A related issue occurs in Fourier 
space: when attempting to define a roundness test for a sheared 
galaxy, Bernstein (2010, last paragraph of section 4.2) finds that 
the finite extent of the modulation transfer function (MTF) pre- 
vents such a test from being shear-covariant, and argues that the 
issue is generic. 
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for some shape measurement methods if the elhpticities are 
defined in incompatible ways, as will be explained at greater 
detail later in this paper (Sec. 14.3]) . We thus conclude that, 
rather than carrying out a catalogue-level comparison, we 
should use the space-based data to make realistic simula- 
tions of ground-based lensing data, to which a shear can be 
added and shear recovery can be tested. 

In this paper, we therefore have three goals. First, we 
outline a method for simulating ground-based lensing data 
using higher-resolution data from space, including a care- 
ful treatment of the original space-based PSF and pixel 
sampling, and conversion to the new grou nd-based PSF 
and pixel sampling, inspired byJKaisei] ([20001). This method 
will allow the galaxies to be sheared, so that we can test 
the recovery of gravitational shear, including the many 
types of sele ction biases a nd P SF eff ects such as those 
descri bed in iHirata et al.l (|200J) and Mandclbau m et al.l 
l|2005l ). This method is a model- free generalization of that 
used in lDobke et al.l (|2010|) , which relies on shapelets decom- 
positions (Gauss-Laguerre basis functions) and therefore de- 
pends on the galaxy profiles and PSFs being well-described 
by sums of these functions to some finite order. Given that 
this assumption is not necessarily true for realistic galax- 
ies and PSFs fe.g.. lMelchior et al.ll2010r ). the advantage of a 
model-independent image simulation method is clear. Sec- 
ond, we describe a publicly-released implementation of this 
method in IDL. We emphasize that, while this paper fo- 
cuses on weak gravitational lensing, this simulation pipeline 
is equally applicable to many other science analyses that 
are commonly done using ground-based data, for example 
modeling of galaxy radial profiles. Finally, we demonstrate 
the method by simulating Sloan Digital Sky Survey (SDSS) 
lensing data, and show how these simulations can be used to 
estimate shear systematics in SDSS to high precision. This 
will be of practical use for interpreting past lensing mea- 
surements that used the SDSS shape catalogue we simulate, 
and eventually for reducing the systematic error budgets in 
future papers using that catalogue. 

We begin in Section [5] by describing weak gravitational 
lensing and its effects on galaxy shapes. Section [3] describes 
the process of removing the effects of the PSF from mea- 
sured galaxy shapes, so that lensing can be measured. The 
space-based data that are used as the basis of these simu- 
lations, and the ground-based data that we simulate in this 
paper, are described in Section 3] In Section [SI we describe 
the methodology that will be used to create accurate sim- 
ulations of ground-based data. We describe steps that we 
took to prepare the space-based data for this purpose in 
Section [6l and our specific implementation of the simulation 
method in Section [T] Technical tests of this implementation 
are presented in Section[Sl and an example of how the SHERA 
outputs can be used to test galaxy shape measurements is 
in Section [9] We discuss these results in Section [TOl 



2 WEAK LENSING BASICS 

Gravitational lensing is the defiection of light from distant 
objects ('sources') by all mass, including dark matter, along 
the line of sight ('lenses'). Typically, it results in a weak but 
coherent distortion in the shapes of distant galaxies (weak 
lensing). This distortion can be quantified by considering 



the true source position (3 and the observed position with 
respect to the lens; instead of the intrinsic surface bright- 
ness profile /(/3), we observe a perturbed profile I{d{/3)) 
described by the following Jacobian in the linear approxi- 
mation: 



de 
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which includes shear 7 = 71 -I- 172 = |7|e'^"'' and convergence 
K. These are in turn related to the deflection potential 
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and to the projected lens mass distribution via 
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(4) 



Here we have used a critical surface density (geometric fac- 
tor) defined aqj 
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li^ls 



(5) 



in terms of the angular diameter distances to the source 
(Ds), lens (-Di), and between the two (-Dis)- 

The vast majority of the weak lensing measurements 
to date have focused on the measurement of shear (shape 
distortions) rather than convergence (magnification), and 
therefore require extremely accurate measurement of the 
shapes of source galaxies. 



3 PSF CORRECTION 

A complicating factor in lensing measurements is that in 
practice, the galaxy shape that is observed has not just 
been lensed; it has also passed through an atmosphere (for 
a ground-based telescope), telescope optics, and a detector. 
This results in a convolution of the intrinsic galaxy pro- 
file with the point-spread function, or PSF. In this paper, 
we define the PSF as including not just atmospheric blur- 
ring, optic, and detector effects, but also pixelisation (the 
'effective PSF' or 'ePSF' in Bernstein & Jarvis 2002). For- 
tunately, the images of the stars provide a measurement, 
albeit a noisy one, of the PSF. 

In order to measure the weak lensing shear, we must 
remove the effects of the PSF on the source galax y shape. 
There are many methods o f doing so; see iMassev et al.l 
l|2007ah or lBridle et al.l (|2010l ) for summaries of the common 
methods of PSF correction. There are several types of bias 
that can arise when trying to extract the lensi ng shear using 
the PSF-corrected shapes (JHirata et al.ll2004h . for example: 



^ Eq. JSj is written in physical coordinates for simplicity; in co- 
moving coordinates additional factors appear. 
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(i) PSF dilution: The PSF tends to round galaxy shapes. 
If this rounding is not fully accounted for, it leads to a mul- 
tiplicative calibration bias that may depend on the galaxy 
profile, S/N , resolution, or PSF characteristics. 

(ii) Systematic shear: If the PSF has some nonzero ellip- 
ticity, then imperfect removal of that ellipticity can manifest 
as a small ellipticity added to each galaxy shape. Since PSF 
shapes tend to have coherent correlations across the sky, 
this leads to a spurious systematics signal in the lensing 
measurement. 

(iii) Selection bias: There are several types of selection 
bias. For example, for galaxies of a given apparent size, it 
may be easier to distinguish the more flattened ones from 
stars, and so the more flattened ones may be more likely 
than rounder ones to end up in a source galaxy sample. 
A selection bias that goes in the opposite direction is that 
some methods may have trouble extracting robust shapes 
for more flattened galaxies, thus selecting against them. 

(iv) Noise rectification bias: For sparsely sampled real- 
izations of an elliptical density distribution, the ellipticity 
tends to be over-estimated. This is an example of noise 
rectification bias, which is known to affect weak lensing 
measurements both in introducing spurious a dditive biases 
and affecting the calibration of the shear s (|Kaiserl l200d : 
iBernstein fc Jarvid[20o3 : lHirata et al.ll2004 ). The details of 
how it affects shape measurements depends on the details of 
how the shape measurements and PSF correction are per- 
formed. For an example calculation of noise rectification bias 
as a function of detection significance, for meth ods that mea- 
sure s hapes using adaptive galaxy moments, see lHirata et al.l 
l|2004 ). 

(v) Population uncertainties: Some methods of shape 
measurement rely on calibration factors that depend on the 
intrinsic properties of the galaxy population being studied, 
such as its ellipticity distribution. Since we only observe the 
ellipticities after the PSF and noise have been added, we 
are necessarily limited in how well we can infer the intrin- 
sic propertie s of th e sample. For some methods, e.g. as in 
iHirata et al.l (|2004f ). this results in the shear responsivity un- 
certainty. For other methods, such as Lensfit which uses a 
prior on the ellipticity distribution for Bayesian inference of 
galaxy shapes, the use of the w rong prior can res ult in an 
incorrect inference of the shear IjMiller et al.ll2007h . 

Because of the forms that these systematics take, shape 
measurement systematics have typically been quantified 
l|Hevmans et al.ll2006l : iMassev et aLlbOOTaT) using two num- 
bers: a multiplicative calibration bia^ij m, and an additive 
calibration bias c, relating the estimated shear 7 to the true 
one 7: 



7 — 7 = rwy + c. 



(6) 



An ideal method would have m = c = for all galaxy types, 
PSFs, and observing conditions (size of the galaxy relative to 
the PSF, and S/N). Unfortunately, even for existing meth- 
ods that have m and c that average to nearly zero for some 
galaxy populations, m and c typically vary with all of these 
factors (|Massev et al.ll2007al : feridle et ailbOld '). so that in 



^^ iMassev et al.l ll2007d) also allowed for the possibility of a non- 
linear response to shear, oc 7^^. 



conditions other than the ones they were tested on, these 
biases may be significant. 

It is worth explicitly contrasting the approaches to 
galaxy models that are commonly used for testing soft- 
ware used for weak lensing shear estimation. One common 
method is to use analytic formulae such as Sersic profiles, 
either individually or as multi-component models with a 
bulge and disk. The clear advantage of this approach is 
that one is in principle only limited by computer process- 
ing power and storage space in how many simulations to 
generate. The disadvantage is that these models do not, 
in detail, represent galaxy profiles well. For example, spi- 
ral arms are completely unrepresented by such an approach, 
and higher redshift galaxies (2 > 1) are mor e likely to be 
highly irreg ular and therefore unrepre sented. IMassev et al.l 
(|2007bl ) and lZhang fc Komatsul (120111 ) showed that there is 
no stable shape measurement algorithm on finite-resolution 
data that has a shear calibration factor that is indepen- 
dent of the galaxy population, because the shear operation 
inevitably couples the lower-order moments of a sheared 
galaxy to higher-order moments (which include not just the 
radial profile, but spiral arms, irregularity, etc.). A simula- 
tion containing simple models does not capture the higher- 
order moments of real galaxies, and so we do not expect 
that it will fully test for the (always present) dependence 
of shear calibration on the higher-order structure. Another 
example is that the single-component models lack elliptic- 
ity gradients (changes in the projected ellipticity and/or 
twists of the isophotes). These features are kn own to exist 
at a non-neglig i ble level in real galaxi es fe.g.. lLauenll985l . 
iHao et al.ll2006l . lKormendv et al.ll2009l ). and cause biases in 
shea r estimation for most extant shape measurement meth- 
ods (|Bernsteinll2010l ) . Finally, pure-Sersic simulations do not 
test for "underfitting" biase^ij in shear measurement meth- 
ods that fit Sersic profiles to galaxies. 

Another approach is represented by the STEP2 sim- 
ulations, which used shapelets decompositions of COSMOS 
galaxies (including the PSF, which was not removed). These 
simulations are therefore intrinsically limited by the cosmic 
variance in the COSMOS field. However, the clear advan- 
tage is that in principle, shapelets can (as an orthonormal 
basis set) represent any galaxy features. Unfortunately, due 
to the finite signal-to-noise of real data, it is necessary to 
truncate the shapelets expansion at some finite order. The 
consequence of this limitation has be en documented in the 
literature (e.g., iMelchior et al.ll2010l ). and results in diffi- 
culty accurately representing galaxies with high Sersic in- 
dex, because of the need to represent both the strong in- 
ner cusp and the large-scale wings of the profile. As shown 
there, this modeling difficulty can cause ~ 20 per cent 
biases when recovering the shear. Furthermore, the lower 
pre-seeing RMS galaxy ellipticity in the STEP2 simula- 
tions at faint magnitudes, erms = 0.20 at r = 26 versus 
0.35 at r = 22 (IMassev et al.ll2007al ) might arise primarily 
from the fact that the shapelets expansion was limited to 
a lower order for the fainter galaxy sample. For a circular 



^^ These are biases in an M-parameter fit to an image that arise 
when the true image has N > M parameters, and some of the 



interest; see, e.g.. IBernstein 2010 



a l pa r 
. iBern 
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shapelets basis, the restriction to lower order tends to give 
rounder galaxies '^^l. Support for this statement comes from 
the fact that PSF-corrected C OSMOS galaxy shapes using 
a non-shapelets based method IjLeauthaud et al.ll2007l ) have 
a roughly flat RMS ellipticity as a function of magnitude. 
Given that nearly all extant PSF-correction methods unfor- 
tunately have galaxy population-dependent (and ellipticity- 
dependent) shear calibrations, we cannot blindly use simula- 
tions that may not accurately represent some non-negligible 
part of the galaxy population to calibrate the shears coming 
from these methods to very high precision. 

Finally, we consider the approach advocated here, using 
SHERA to represent realistic galaxies. The current limitation 
set by the cosmic variance in the COSMOS field is unfortu- 
nate, but we can ameliorate this problem in the future by 
using the HST archive to expand the set of galaxies that 
can be used as the basis for simulations, both to other ACS 
fields and to include images from other HST instruments. 
Also, since we are using realistic galaxies without modeling 
assumptions, we are free from concerns that some method of 
shape measurement might appear to perform unfairly well 
because the galaxies were constructed using the same set of 
models used for PSF correction. More generally, we do not 
have to worry that we have missed features of the galaxy 
population that are problema tic for some or all m ethods of 
PSF correction. The results of lBridle et"aLl (|2010l '). and oth- 
ers cited in Sec. [TJ strongly suggest that if we want to pre- 
cisely estimate the bias due to use of some particular shape 
measurement method in real data, we must include realistic 
galaxies and observing conditions. However, this conclusion 
does not undermine the utility of the STEP and GREAT 
simulations. For example, the GREAT simulations provide a 
very well-defined way to test the performance of shape mea- 
surement methods as a function of particular parameters 
{S/N, PSF size, galaxy radial profile) while keeping other 
parameters fixed. This test provides valuable insight into 
the failure modes of particular methods, facilitating method 
development, whereas SHERA eff'ectively integrates over the 
parameters of the galaxy profile, PSF size, and other survey 
parameters to provide a good overall bias estimate, without 
necessarily revealing the details provided by the GREAT 
simulations. 



4 DATA 

In this section, we describe the space-based data used as 
inputs to the simulation pipeline, and the SDSS data that 
are being simulated as a test case. 



4.1 COSMOS 

The COSMOS Hubble Space Tele scope (HST) Advanced 
Camera for Surveys (ACS) field (|Koekemoer et all l2007l : 
IScoville et al.ll2007al lbr) is a contiguous 1.64 degrees^ region 



^^ In practice, we also expect some contribution to this round- 
ing from the fact that the ACS PSF was not removed from the 



centred at 10:00:28.6, +02:12:21.0 (J2000). Between Octo- 
ber 2003 and June 2005 {HST cycles 12 and 13), the re- 
gion was completely tiled by 575 adjacent and slightly over- 
lapping pointings of the ACS Wide Field Channel. Images 
were taken through the wide F814W filter ("Broad I"). In 
this paper we use the 'unrotated' images (as opposed to 
North up) to avoid rotating the original frame of the PSF. 
By keeping the images in the default unrotated detector 
frame, they can be stacked to map out the observed PSF pat- 
terns. The raw images a re corrected fo r charge transfer inef- 
ficiency (CTI) following iMassev et al.l l|2010br ). Image regis- 
tration, geometric distortion, sky subtraction, cosmic ray re- 
jection and the final combination of the di thered images are 
performed by the mult idrizzle algorithm iJKoekemoer et al.l 
|2002| ). As described in [Rhodes et all (120071 ) . the multidriz- 
zle parameters have been chosen for precise galaxy shape 
measurement in the co-added images. In particular, a finer 
pixel scale of 0.03"/pix was used for the final co-added im- 
ages (7000 X 7000 pixels). The source catalogue used in 
this pap er is constructed follow ing the methodology out- 
lined in iLeauthaud et al.l (120071 ) and then matched to an 
updated version (vl.7 dated from the P' of August 2009) 
of the COSMOS pho tometric redshift catalogue presented 
in lllbert et al.l (|2009l ). For the purposes of this paper, the 
following cuts are then applied: 

• F814W < 22.5: This cut allows us to start with a parent 
sample of galaxies that is deeper than what can be seen in 
SDSS, but stiU with very high S/N in the COSMOS images. 

• MU -CLASS = 1: This requirement uses the relationship 
between the object magnitude and peak surface brightness 
to select galaxies, and to reject stars and junk objects such 
as residual cos mic rays (the exact de finition of mu_CLASS 
can be found in ILeauthaud et al. 2007)^ 

• CLEAN = 1: As in ILeauthaud et~al] (|2003), this cut 
is required to eliminate galaxies with defects due to very 
nearby bright stars, or other similar issues. 

• GOOD_ZPHOT_SOURCE = 1: This cut requires that 
there be a good photometric redshift, which typically is 
equivalent to requiring that the galaxy not be located within 
the masked regions of the Subaru BVIz imaging used for 
photometric redshifts, and that it have a successful match 
against an object in the Subaru imaging. 

The first two cuts give an ideal parent sample of 33 517 
galaxies, and the latter two cuts (which are necessary in 
practice for manipulating the images) reduce that to 30 225. 
Some of these galaxies are too faint to be detected in SDSS, 
and some are too small to be resolved given the size of the 
SDSS PSF. For each of these galaxies, an ideal postage- 
stamp size is estimated as 



L(pixels) = ll^/(1.5ri/2)2-f 



1.2 



0.03 X 2.35 



(7) 



COSMOS galaxies. 



This estimate uses the SExtractoij^ (|Bertin fc Arnoutd 
IT996) FLUX_RADIUS (calculated with 

PHOT_FLUXFRAC= 0.5) as an estimate of the half- 
light radius ri/2, and (in a Gaussian approximation, with 
ri/2 ~ 0.7(t) adds it in quadrature with an SDSS PSF of 
FWHM = 1.2", a typical value. The factor of 0.03 converts 

^•^ http://www.astromatic.net/software/sextractor 
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Figure 1. The fraction of galaxies in the parent sample of 
clean FHIAW < 22.5 galaxy detections for which a postage 
stamp was generated, as a function of the COSMOS SExtractor 
FLUXJIADIUS. 



the FWHM in arcsec to the COSMOS pixel scale, and 
the 2.35 is required to express the typical SDSS 1.2" PSF 
FWHM as a Gaussian a. It then requires that the postage 
stamp go to more than ±5o" in the predicted galaxy size 
after convolution with the target PSFI^ If this postage 
stamp size causes the postage stamp to hit the CCD edge, 
then the galaxy is eliminated. Likewise, those galaxies for 
which the ideal postage stamp size exceeds L — 1000 pixels 
were eliminated (119 objects), resulting in an intrinsic upper 
limit in the values of FLUX_RADIUS for which postage 
stamps were generated. Consequently, the probability of a 
galaxy in our parent sample having a postage stamp is a 
weak function of the observed galaxy size, specifically the 
FLUX_RADIUS. This probability is shown in Fig. [T] when 
computing statistics of the sample, we must weight by the 
inverse of this curve to remove this artificial selection effect 
and obtain a fair galaxy sample. 

After these cuts to ensure that the postage stamp can 
be generated, the sample for which there are postage stamps 
contains 26 113 galaxies. 



4.2 SDSS 



The SDSS l|York et all |200G| ) imaged roughly n steradi- 
ans of the sky, and followed up approxim ately one million 
of the detected objects spectroscopically ([Eisenstein et al.l 
I2OOII : [Richards et all [2003 : IStrauss et al.ll2002l ). The imag- 
ing was carried out by drift-scanning the sky in photo- 

^'* Note that FLUX_RADIUS is an azimuthally-averaged quan- 
tity. Thus, for highly flattened objects, we may be concerned that 
PSF convolution will cause them to become so large that they do 
not fit on the generated images. We test explicitly whether this 
is the case before using the resulting postage stamps. 



metric conditions IIHogg et al.l 200ll: Ivezic et al. 2004), in 
five bands (ugriz) (JFukugita et al.lll996l : ISmith et al.ll2002h 
using a specially-designed wide-field camera ( Gunn et al.l 
[1993). These imaging data were used to create the clus- 
ter and source catalogues that we use in this paper. All of 
the data were processed by completely automated pipelines 
that detect and measure photometric p roperties of objects , 
and astrometrically calibrate the d ata IjLupton et al.ll200ll : 
iPier et al.|[20o3 : iTucker et al.ll2006D . The SDSS I/II imag- 
ing surveys were com pleted with a seventh data release 
(jAbazaiian et al.ll2009l ). though this work will rely as well 
on an improved data reduction pipe hne that was part o f the 
eighth data release, from SDSS III (JAihara et al.ll201lh . 



4.3 Shape catalogue 

The catalogue of source galaxies with shape measure- 
ments that we are simulating in this work is described 
in Reyes et al. (2011, in prep.), and is an upda te of 
that originally described in iMandelbaum et al.l (|2005|) with 
additional area and several technical improvements. This 
source sample has over 42 million galaxies from the SDSS 
imaging data with r-band model magnitude brighter than 
21.8, with shape measurements obtained using the RE- 
GLENS pipeline, incl uding PSF correction done via re- 
Gaussianization fH irata fc Seliakl ^2003) and with cuts de- 
signed to avoid various shear calibration biases. Among 
those cuts are a flux limit of r < 21.8, and the require- 
ment that the PSF-corrected shape be measured in both r 
and i bands with sufficient resolution (to be defined more 
quantitatively later in this section). 

Using the software developed in this work, we hope to 
more tightly constrain the shear calibration, including the 
full list of possible biases from Sec. El than was originally 
possible in IMandelbaum et al.l l|2005l ) (which had allowed 
for an overall 2a shear calibration uncertainty of [—5, 4-12] 
per cent for r < 21 galaxies, and [—8, 18] per cent for r > 21 
galaxies) . 

One of the technical difficulties that complicates any 
direct comparison of these shape measurements with those 
from the COSMOS catalogue is the definition of the shapes. 
Both the RRG method and re-Gaussianization define ellip- 
ticity in terms of a moment matrix M via 



M, 



My 



ei 



62 



M^^ + My 



2M^ 



M^^ + Myy 

which translates to an ellipticity definition 



1 - <£« 



1 



<&« 



(8) 



(9) 



for some effective minor-to-major axis ratio Qeff = feeff/fleff. 
However, the determination of the moment matrix M, and 
therefore the ei, 62, and gcff, is done differently for the two 
methods. 

In general, the definition of moments according to each 
method uses 



(method) 



= / I{x)w^rict\-iod{x){x-Xo)i{x-XQ)jAx. (10) 



M. 



The RRG method, used to PSF-correct the COSMOS galaxy 
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shapes, defines second moments with a weight function 



WRRG(a^) = exp 



^ +y 



(11) 



where r^ is a galaxy size estimate calculated fi'om the SEx- 
tractor detection area. Thus, this method uses a circularly- 
weighted moment with a fixed radius. In contrast, the re- 
Gaussianization method uses adaptive moments, which en- 
tails minimizing the integral 



E 



I(x) — Aoscp 



1 



a5-a;o)^M ^{x-xo) 



(12) 

over the quantities (A, a3o,M). This procedure amounts to 
weighting by ■w^^'^^'^^\x) corresponding to the best-fitting 
elliptical Gaussian that represents the image itself, which in 
practice is determined iteratively. 

Analytical calculations show that for an elliptical Gaus- 
sian profile, the difference between the circular vs. elliptical 



weight functions means that the two ellipticities |e 



(RRG) I 



and 



^(adapt), will be related as 



(adapt) 



2|e 



(RRG) I 



l_^|e(RRG)| 



(13) 



Furthermore, for non-Gaussian light profiles, e''"*''^'' does 
not depend on any assumed radius whereas e'''^''^ ' does; 
more problematically, for a profile with fixed axis ratio, 
changing the profile from Gaussian to a more general pro- 
file with elliptical isophotes (e.g. Sersic profiles) does not 
modify |e(=''*='P''| whereas it does change |e''^'^'^^|. Finally, in 
the presence of ellipticity gradients with radius, the differ- 
ent way of choosing the effective radius of the weight func- 
tion will result in different measured ellipticities. So, we can- 
not compare individual estimates of the galaxy shapes from 
the two PSF correction methods in any obviously model- 
independent way. We therefore conclude that the way for- 
ward is a simulation of the ground-based image using the 
high resolution COSMOS image, in order to directly test 
the accuracy of shear recovery. 

For the purpose of this work, we define the 'resolution 
factor' R2 using the trace of the adaptive moment matrices, 

T = Af,, + Myy (14) 

where t'^^ and T'^^ are the traces for the PSF and for the 
PSF-convolved galaxy image, respectively. Then the resolu- 
tion factor is 



R2 = 1 






(15) 



which approaches 1 in the limit that the galaxy is perfectly 
resolved, and in the limit that it is completely unresolved. 
Our requirement on the resolution factor is R2 > 1/3. In 
the limit of Gaussian PSF and galaxy with standard devia- 
tions CTgai and crpsF, respectively, this resolution factor cut 
corresponds to agai > apsp/V^. 



5 SIMULATION METHODOLOGY 

In this section, we discuss the principles behind simula- 
tions of lower-resolution (ground-based) data from higher- 
resolution (space-based) data. First, we define some nota- 
tion. 



We assume that a galaxy is described by an intrinsic 
unknown surface brightness function /(x) as a function of 
angular position x. We are given a high-resolution image 
(COSMOS) with some effective PSF Gi(x), i.e. the observed 
surface brightness is 

/i (x) = [/ * Gi] (x) = I /(x')Gi (x - x') d^x'. (16) 

We would like to generate a low-resolution image I2 
(corresponding to what would be observed by some ground- 
based telescope with PSF G2), 

J2(x) = [/*G2](x) = |/(x')G2(x-x')dV. (17) 

While this equation may initially appear to represent a triv- 
ial PSF-matching process (Sec. 15. 1) . there is a complicating 
factor that arises if we want to represent a sheared image 
(Sec. 15. 2) ). since shearing and convolution by the space-based 
PSF do not commute. 



5.1 PSF matching 

First, we consider the case in which we simply wish to gen- 
erate a low-resolution image without any added shear. In 
that case, the task of generating a low-resolution image I2 
is simply a matter of PSF-matching. The simplest model- 
independent way to do this is to work in Fourier spacq^. In 
that case, the convolutions with the PSF are multiplications: 



/i(k)=/(k)Gi(k) 



(18) 



and likewise for the low resolution image 72. 

In that case, once we have a Fourier-space image Ji, 
original PSF Gi and target PSF G2, we can simply generate 
the low-resolution image via 



ii. 



(19) 



Naturally, we must place some conditions on the low- and 
high-resolution PSFs in order to carry out this PSF match- 
ing. As a rule, PSFs are band-limited at some wavenumber 
kc above which there is essentially no power (this corre- 
sponds to the small scales below which there is no informa- 
tion about the galaxy profile). By definition, the PSFs in 
high and low resolution data tend to satisfy |Gi| > IG2I at 
all wavenumbers k, with the band limit kc.i > kc,2', for an 
example of how this relation is satisfied for typical COSMOS 
and SDSS data, see Fig. (2] and the corresponding real-space 
PSF images in Fig. [S] In fact, this inequality is a require- 
ment for numerically stable and model-independent PSF- 
matching; if it is not satisfied, then the operation in Eq. (|19|) 
corresponds to deconvolution for at least some wavenum- 
bers, which will lead to undesired image properties such as 
ringing. This deconvolution can be done in the context of 
model-fitting methods; however, the meaning of the small- 
scale powe r recovered in th e process of such a deconvolution 
is unclear (|Bernsteinll201(]| ). 



^^ We indicate Fourier-space quantities witii a tilde, and the dis- 
tances in pixels in real and Fourier space are x and fc, respectively. 
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Figure 2. An example of the relevant scales for the PSF in SDSS 
and COSMOS at a randomly-selected point in the COSMOS field. 
The plotted quantity is the azimuthally-averaged PSF power \G\'^ , 
as a function of wavenumber. As shown, the band limit of the 
SDSS data is on scales where the Fourier transform of the COS- 
MOS PSF is still close to 1. 



5.2 Introducing a shear 

Now, we consider the less trivial case where we want to sim- 
ulate a ground-based image with an added shear, for the 
purpose of testing PSF correction. We denote this sheared 
ground-based image Jj (x) to distinguish it from the un- 
sheared ground-based image /2(x) considered in the previ- 
ous subsection. 

To define the problem more clearly, we imagine a Ja- 
cobian matrix J that transforms the observed (post-shear) 
coordinates Xo to the intrinsic (pre-shear) coordinates Xi on 
a galaxy: 



Xi = JXo. 



(20) 



The Jacobian J is thus a 2 x 2 matrix. Usually J will 
be close to the identity; indeed, to first order in the shear, 
J is simply given by Eq. ([T]). We will denote the singular 
values of J by 5'± , with S- 4: S+, and det J = 5*- S-h . If J 
is symmetric, then S± are also the eigenvalues. 

We then have a sheared galaxy image 



/o(Xo) = /(JXo), 

and the observed sheared image is 

4^'(Xo) = [/o*G2](Xo) = /',f(Jx')G2(Xo- 



(21) 



(22) 



It is assumed that the S/N of the input images is large 
enough that the noise is negligible. (Clearly the output im- 
age may be made to have arbitrary levels of noise by adding 
noise at the end.) For the situation considered here, we thus 
limit ourselves to relatively bright detections in COSMOS 
{S/N > 50). In Sec. 18.51 we demonstrate the effects of this 





Figure 3. Top: Real-space image of the Tiny Tim COSMOS 
PSF for which the azimuthally averaged Fourier space power was 
shown in Fig. [2] The image is shown on a logarithmic stretch, 
in order to display the diffraction rings. Contours are shown for 
a flux level equal to 0.5 and 0.1 times the maximum flux level. 
Bottom: Same as top, for the SDSS PSF; the low-level patterns in 
the outer regions are due to a small amount of noise in the PSF 
model. 



low level of noise in the input images on the simulated im- 
ages, given that it is sheared and therefore could contribute 
to an estimated shear. We defer the development of a for- 
malism to account for non- negligible noise levels in the input 
high-resolution data to future work that may use lower S/N 
space-based data. 

We assume that the PSFs Gi and G2 satisfy the follow- 
ing band- limiting constraints: first, that 1(52(14)1 = (or at 
least is negligible) for |k| > kc for some kc (the band limit); 
and second, that 1(51(14)1 is nonzero (and in practice we as- 
sume it is far from zero, e.g. > 0.5) for jkj < kd for some kd- 
We impose the assumption that 



Kc ^ i^— '>'d, 



(23) 
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which in practices amounts to requiring a significant range of 
scales on which the ground-based PSF erases all information 
that is still resolvable in the space-based images. 

While the PSF-matching problem can be simply for- 
mulated as trying to take /i(x) and obtain hi^), the case 
where we want to introduce a shear for testing purposes (a 
simulated gravitational shear, which changes the galaxy im- 
age before the imposition of the PSF) instead requires us to 
infer 7^^' (xo). 

The principle behind the solution is to attempt a partial 
deconvolution of Ji. We now define a filter r(x) that has a 
Fourier transform satisfying 



f(k) 



1 



Gi(k) 



for Ikl < fed 



(24) 



(Here we have used our band- limiting condition on Gi.) For 
|k| > kd an arbitrary choice of T'(k) may be made, e.g. it 
may be taken to decrease smoothly to zero so that the real- 
space function T(x) does not have tails at large radii. 

Then we define the pseudo-deconvolved image P(x) by 



P(x) = [h * r](x) = f Ji(x')T(x - x') d'x'. 



(25) 



or in Fourier space, 

P(k) = /i(k)f (k) = /(k)Gi(k)f (k), (26) 

so that P(k) = /(k) for \k\ < k^. 

Our second step is to shear the pseudo-deconvolved im- 
age, i.e. we create 



Po(Xo) = P(JXo). 



(27) 



The Fourier transform is given by the usual rule for the 
transform of a quantity with a linear shear, 



^°(^°)-uiiT^(-^""'^°) 



(28) 



The singular values of J"^ ^ are S_j_^ . Therefore, we see that 
if Ikol < kc, then: 



IJ^^'kol s^ST^ol <SZ'fcc<fcd 



(29) 



Therefore, Po(ko) = /o(ko) for |ko| < kc. 

Finally, we convolve Po with the target low-resolution 
PSF G2 to get 

J7(Xo) = [Po * G2](Xo) = / Po(Jx')G2(Xo - x') d^x'. (30) 



(31) 



In this case, we have 

^(ko)=Po(ko)G2(ko). 



There are now two cases: |ko| is either (i) < kc or (ii) ^ kc- 
We consider each of these: 

• If |ko| < kc, then Po(ko) = /o(ko) so H{ko) = /^^'(ko). 

• If |ko| ^ kc, then G2(ko) = so H{ko) = /:^^^(ko) = 0. 

In either case, we have H(ko) = /2 (ko), so it follows that 
H{xo) = Jj (xo). Therefore, the function H that we have 
constructed is exactly the sheared image that would be ob- 
served with the low-resolution telescope. 



5.3 Other observational issues 

To demonstrate that the PSF-matching is accurate, we will 
determine the target PSF and noise level using the observing 
conditions at the position of the COSMOS galaxy in the 
SDSS imaging. This procedure will allow us to compare the 
simulations with the real SDSS imaging in the COSMOS 
field, as a basic test of the SHERA code. Likewise, we will use 
these simulations to determine the shear calibration bias, as 
a demonstration of the method. However, in order to make 
a fair test of the shear calibration of the entire SDSS shear 
catalogue, it would be necessary to draw random points from 
within the SDSS area and use the observing conditions from 
those points. The difference between these two approaches 
would not be significant if the quality of the SDSS imaging 
in the COSMOS field was representative, but as discussed 
in Appendix 1X1 this is not the case. 

Also, in order to fairly test the shear calibration, we 
must impose any selection criteria from the real data on 
the simulations. This will also allow for a test of selection 
biases, since the input galaxy sample is a fair sample of all 
galaxies brighter than some flux limit {F814:W < 22.5) that 
is deeper than the SDSS flux limit in the single epoch images 
(r < 21.8). 

Finally, one limitation of these COSMOS images that 
we use as the basis for our simulations is that they only 
are in P814VK (broad I). The existence of colour gradients 
means that the galaxies would look morphologically different 
in other passbands, and the effect is probably the strongest 
for galaxies with a reasonable-sized bulge and disk, for which 
the bulge is more prominent in red bands and the disk in 
blue bands. However, since most lensing analyses use r or J 
for shape measurement , this does not represent a signiflcant 
limitation (|Voigt et al.ll201ll 'l. It is an argument, however, 
for using another field with multi-band data, provided that 
(a) the CTI effects (see section l47l|) are well- understood and 
corrected for properly, and (b) the field was chosen in some 
fair way (e.g., it is not a galaxy cluster field, which would 
have an atypical morphology distribution). 

For our purposes in SDSS, we can use simulations with 
a fair distribution of observational conditions to precisely 
determine the i-band shear calibration. Since our science 
analyses use averaged r and i band shapes, we can then use 
the data itself to determine the ratio of the measured signal 
to that with just i band. The measurements in r and i are 
highly correlated because the shape noise is the same, so this 
allows the shear calibration for the actual science analyses 
to be determined very precisely. 



6 IMAGE PREPARATION 

Before we can actually carry out the image simulations, 
there are a number of steps that must be done to process the 
galaxy postage stamp images described in Section BTTI These 
steps have been carried out already for the images that are 
being publicly released^'' I. Thus, these newly released images 
differ from the already-public version 2 COSMOS images in 
that they are restricted to a bright subsample of galaxies. 



^^ http : //irsa. ipac . caltech. edu/data/COSMQS/ images/ 
gal axy_postage_st amps/ 
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they have a smaller post-processing pixel scale (0.03"), they 
include the pixel-based CTI correction, and include the post- 
processing described in the remainder of this section. 



6.1 Catalogue of galaxy properties 

We have used the COSMOS data to prepare a number of 
inputs that can be used for simulations of SDSS i-band im- 
ages. 

First, we need a total galaxy flux in i band, whereas 
the COSMOS images are in _F814W. A significant fraction 
of the galaxies that constitute our parent sample are de- 
tected in SDSS, and therefore have measurements of the 
i-band flux. However, these measurements are far noisier 
than the flux measurements from the COSMOS data, so we 
wiU use the COSMOS F9,14W fluxes to determine the nor- 
malization of the flux in the simulated imagesQj We begin 
with the reported SExtractor MAG_AUTO magnitudes in 
-F814W^, which we correct for galact ic extinction using the 
dust maps from lSchlege l ct al. ( 1998j and the extinction-to- 
reddening ratios from [Stoughton ct al. (2002). These mag- 
nitudes are designed to precisely determine the total magni- 
tudes for galaxies, similar to Kron magnitudes (|Kronlll98Cl ). 
and are superior to aperture magnitudes in recovering all 
the galaxy flux. 

In order to account for slight differences in the two fll- 
ters, we then compare the extinction-corrected COSMOS 
MAG_AUTO and SDSS model magnitudes for reasonably 
bright galaxies (i < 20), and determine a mean offset of 
0.03 mag, i = _F804W - 0.03. This mean offset is then sub- 
tracted from the _F814W^ magnitudes for all galaxies in the 
parent sample. 



6.2 Postage stamp preparation 

There are several types of processing that must be done 
to the original CTI-corrected galaxy postage stamps before 
using them as in puts to the simulations. F or this purpose, we 
use SExtractor l|Bertin fc Arnoutj|l996l ') v2.5.0. First, we 
run SExtractor with the COSMOS multidrizzle weight map 
for each postage stamp, instructing it to subtract off a flat 
background level equivalent to the residual background after 
the original multidrizzle processing was completed. We do 
not allow it to carry out its own sky determination because 
the size of the postage stamps is not sufficient for it to do so 
without being unduly influenced by the light of the galaxies 
in the postage stamp. 

The results of this step provide us with a list of de- 
tected objects in the postage stamp. Ideally, if deblending is 
properly done, then one of those (our target galaxy) will be 
at the centre of the postage stamp. We use the SExtractor 
output segmentation image to identify all pixels belonging 
to objects other than the target galaxy, and replace those 
pixels with a noise field having the same noise characteris- 
tics as the rest of the postage stamp (including overall noise 
amplitude as well as pixel-to-pixel correlations). The num- 
ber of masked objects is < 18 (< 60) for 50 (95) per cent 



^^ If we use the SDSS magnitudes, then we include noise in the 
measurement twice, since the SDSS magnitude measurements are 
noisy due to the sky noise that we then put into the simulations. 



of target galaxies, resulting in 0.9 (6) per cent of the pixels 
being masked; the majority of the masked objects are quite 
small and faint, with some appearing to be misidentifica- 
tion of the correlated noise as actual objects. This object 
masking procedure is dependent upon SExtractor correctly 
identifying all pixels belonging to other objects; it is not fully 
successful with some very bright objects, leaving a halo of 
pixels containing a low, but visually noticable, level of light 
surrounding the masked regions. Fortunately the incidence 
of such cases is low. These processed postage stamps are 
included with the code and data release. 

While carrying out this procedure, we compute addi- 
tional statistics for each postage stamp, including (a) the 
distance of the nearest object to the postage stamp centre 
(which we require to be ^ 5 pixels, after visual inspection 
of cases failing this cut suggested that those cases suffered 
from poor deblending), and (b) various noise statistics such 
as the median and the mean pixel value for those pixels not 
included in objects (which can differ significantly if there 
is some very large bright object in the postage stamp that 
did not get properly masked) . Imposing cuts based on these 
statistics reduces the size of our sample from 26 113 to 25 527 
postage stamps, a decrease in sample size of 2.2 per cent. 

If the masked objects are sufficiently close to the central 
galaxy, then in the SDSS it will not be possible to distinguish 
between them. This fact will allow us to quantify the level of 
undeblended projections in our shape catalogue: we identify 
those cases for which the apparent size of the galaxy in the 
SDSS is significantly larger than its counterpart in the PSF- 
matched images from COSMOS. 



6.3 COSMOS PSF estimation 

In order to remove the effects of the COSMOS PSF, we 
must determine the COSMOS PSF at the galaxy position. 
We follow the sam e proced ure as in iLeauthaud et al. (20071 ') 
and iMassev et al.l (|2007al ). who use PSF models from a 
modification of version 6.3 of the Tiny Tim ray-tracing 
progranv^l. These models represent PSFs for different pri- 
mary/secondary separation, since that separation is the 
main determinant of the PSF ellipticity. They are known to 
be a bit too small because they neglect the ' red halo' that 
enlarges real HST PSFs at long wavelengths i|Sirianni et al.l 
!l998), possibly due to backscattering off the front surface of 
the CCD. We quantify whether the deviation between these 
models and the real stellar images represents a problem for 
using the models to simulate ground-based data in Sec. 18.41 
Our procedure is to use the previous determination 
(|Leauthaud et all 120071 ) of the primary/secondary separa- 
tion for each exposure based on the ellipticity of ~ 20 bright 
stars, then to use that particular Tiny Tim model extrap- 
olated to the CCD (a;, y) position of each galaxy. The esti- 
mated Tiny Tim PSF for each galaxy position is included 
with the data release associated with this work. 



6.4 Determination of target imaging properties 

In this section, we describe what information must be spec- 
ified about imaging conditions in the data that is to be sim- 



^* http : //www. stsci . edu/sof tware/tinytim/ 
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ulated. Here, we restrict ourselves to a general discussion; 
however, for the purpose of testing and demonstrating the 
code, we attempt to simulate the SDSS data in the COSMOS 
field. Appendix |X] contains details of exactly what SDSS 
pipeline outputs are used. 

PSF: The desired PSF for the simulated data must be 
provided as a postage-stamp image. It should be the desired 
PSF including the pixel response function; e.g., if a Gaussian 
PSF is of interest as a test case, the Gaussian should be 
integrated within pixels, rather than sampled at the pixel 
centres. 

Photometric calibration: The most basic case would 
be to simulate data using the FSl'iW magnitudes. To do 
so, the code requires the total flux normalization in counts, 
as determined from the COSMOS total galaxy magnitude 
and the flux normalization for the ground-based survey of 
interest. 

Noise level: The default noise model is a spatially con- 
stant, uncorrelated Gaussian random field. For most ground- 
based surveys, the sky level is sufficiently high that the Pois- 
son noise due to the sky is effectively Gaussian; and most 
galaxies used for weak lensing are sufficiently faint that the 
sky noise dominates over the noise due to the galaxy fiux. 
Thus, the code simply requires a single noise variance to 
work in this basic mode. However, it also allows the choice 
of Poisson noise, and the user may optionally input a gain 
in order to also add the noise due to the galaxy flux (im- 
portant for relatively bright galaxies). Future versions of the 
code may allow the user to simulate a correlated noise fleld 
with a user-defined noise power spectrum, which will be im- 
portant for assessing the impact of correlated noise on shear 
estimation. 

The simulated postage stamps have a sky level of zero. 
Any constant or varying sky should be added by the user 
after running SHERA. 



7 IMPLEMENTATION 

For compatibility with many common astronomical image 
manipulation packages, we have implemented this simula- 
tion method in IDL. The data release includes a detailed 
description of the code options; here, we limit ourselves to a 
basic description of how the code carries out the procedure 
from Sec. \5\ 

The SHERA code operates on a set of input postage 
stamps: the original COSMOS image and PSF, and a target 
PSF to which we want to match. The manipulation of the 
images to create the simulated galaxies is performed using 
double precision arithmetic. 

The first step in the image manipulation is to change the 
sizes of the input postage stamps of the COSMOS galaxy, 
COSMOS PSF, and target PSF due to several considera- 
tions. When doing the Fourier space manipulations, it is 
convenient to have the ratio of the COSMOS and the target 
PSF postage stamp sizes be equal to the ratio of the tar- 
get and COSMOS pixel sizes, which is 7?pix = 0.396" /0.03" 
in the case of SDSS simulations (however, the code allows 
for nearly arbitrary choice of target pixel size, such that 
i^pix > 1). Likewise, it will be most convenient when work- 
ing in Fourier space if the COSMOS PSF and COSMOS 
galaxy postage stamp sizes are equal. 



Thus, we begin by padding the arrays until they achieve 
the appropriate size ratios. While the default is to pad with 
zeros, we also provide the option of padding with a realistic 
COSMOS noise fleld. Once we have a target PSF postage 
stamp of size iVT x Nt and COSMOS galaxy and PSF 
postage stamps of size A''c x A''c, with A^'c = -RpixAT, we 
can proceed with the analysis. To begin, we renormalize the 
flux in the PSF postage stamps so that the sum of the flux 
in all pixels is 1. 

In the description that follows, we denote the observed 
galaxy images using I (with subscript C for their image in 
COSMOS and T for the simulation of the target dataset), 
and PSFs using G (again with subscripts to indicate which 
PSF). Thus the images we begin with are /c, Gc, and the 
target PSF Gt- All three images are Fourier-transformed 
using the IDL routine fft, after which we multiply them 
by Nq or N-^ for proper normalization. The result of the 
Fourier transform is a double-precision complex array with 
the same dimensions as the originalr^l 

With the Fourier-space COSMOS PSF Gc, we can now 
construct the pseudo-deconvolution kernel T{k). Unlike a 
pure deconvolution kernel, 1/Gc, T has an additional fac- 
tor that avoids division by small numbers (i.e., where the 
COSMOS PSF has erased most information). We deflne this 
factor as 



y(fe) = 



l4-|0.5/Gc(fe)po 



and thus T by 



T{k) 



Y{k) 
Gc(fe)' 



(32) 



(33) 



The Y factor has been chosen to be very close to 1 for all 
scales where |Gc| ^ 0.5, and zero when |Gc| ;$ 0.5, with 
a smooth and rapid transition between these two regimes. 
Thus, it approximates a pure deconvolution at wavenum- 
bers where such an approach is possible, and removes all 
power at smaller scales. In practice, comparison with Fig. [2] 
demonstrates that this kernel gives a pure deconvolution for 
all scales within a factor of 2 of the SDSS band limit. 

The pseudo-deconvolved image (Eq. I26p can then be 
formed directly by multiplication of the elements of the two 
arrays at a given fc, using 



p{k) = f{k)ic{k). 



(34) 



The PSF for this pseudo-deconvolved image is simply 
Eq. (|32p . Examination of these pseudo-deconvolved images 
in real-space suggests that they very frequently include some 
ringing, always at higher wavenumbers (smaller scales) than 
the band limit of any reasonable ground-based PSF. In prac- 
tice this ringing is not relevant, since we do not work explic- 
itly with the real-space pseudo-deconvolved images, and the 
step of convolving to match a ground-based PSF will remove 
the ringing. 

At this stage, in order to reduce the effects of shape 
noise, we also deflne a 90 degree rotated galaxy image. Since 



^^ While this resulting array would seem to have twice as much 
information as the original real-space arrays, in fact the real part 
of the result is even and the imaginary part is odd, so the amount 
of information is preserved. 
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we would like this image to be rotated before shearing or 
applying the target PSF, we rotate P by 90 degrees about 
its central pixel to create P^''°^\ and note that its effective 
PSF is the 90 degree rotation of Eq. (f32l) . 

The next step is to shear both P and p(''°''. As de- 
scribed in Section [SI the shearing can be carried out simply 
in Fourier space using Eq. ((28}. We take advantage of the 
fact that the coordinates before and after shearing are lin- 
early related to each other, and use the IDL routine poly_2d 
that is designed to perform polynomial warping of images 
with various interpolation methods. For the level of accu- 
racy that we wish to achieve, we require the most precise 
(and time-intensive ) interpolation allowed by t hat routine, 
cubic interpolation ( Park fc Schowengerdtll983h . This inter- 
polation method approximates sine interpolation, which is 
in principle exact if the image is Nyquist sampled. Instead 
of the sine function, this routine uses cubic polynomials to 
make a function that is very similar, and that goes to zero 
(and has a derivative that goes to zero) at the edge of the 
window used for the interpolation. Since the arrays we want 
to shear are complex, we separately interpolate the norm 
and phase using this routine in order to reconstruct the 
sheared and pseudo-deconvolved images P'^' and p(''°*'T). 
In Sec. 18.11 we will present tests demonstrating that the in- 
terpolation routine we have used is sufficiently accurate for 
our purposes. 

The final Fourier-space manipulation is to match the 
desired target PSF by constructing a kernel Kt, or K^° 
for the 90 degree rotated image. To do this, we must divide 
the target PSF Gt by the effective PSF for the pseudo- 
deconvolved image (Eq. [32]or its 90 degree rotation). Once 
we have the matching kernel, the PSF-matching is then per- 
formed in Fourier space via multiplication of each element 
of P^"'' and p{i'ot,7) Yyy ^jjg corresponding element of Kt'- 



f(7). 



d{7) 



r^^'ik) ^P^^>{k) xKrik) 



and 



f(rot,7) 



(fc) 



p(™*'^Hfc) x^4"*'(fc)- 



(35) 



(36) 



The transformation to real space is again carried out us- 
ing fft. The real part of the resulting double- precision com- 
plex array is taken (in practice, the imaginary part, which 
should be precisely zero, is very small but nonzero due to 
negligibly small numerical inaccuracies) . To achieve the tar- 
get pixel scale, these images are resampled, which requires 
interpolation since the ratio of the pixel sizes is not neces- 
sarily an integer. For this purpose, we use the IDL routine 
congrid with cubic interpolation (the same interpolation 
used for the shearing, which approximates a sine function). 
The Fourier-space PSF-matching procedure implicitly ac- 
counts for the different pixel response functions, which is 
why we resample rather than rebinning the images. 

At this point, the total number of counts in the image is 
renormalized, and noise is added as desired. The total pro- 
cessing time per typical galaxy is approximately 1 second. 
For the purpose of our basic testing, we save four images 
per galaxy: the original orientation and the 90 degree ro- 
tated orientation, both before adding noise and after adding 
noise. For the purpose of the discussion that follows, we will 
refer to these as 'noiseless' (ignoring the low level of noise 
from the original COSMOS data) and 'noisy,' respectively. 

An example of how this processing works for one par- 



ticular COSMOS galaxy is shown in Fig. |4l which shows 
the original COSMOS image (with some fine structure), and 
the degraded SDSS image without and with a gravitational 
shear of 71 = 0.1, i.e. along the horizontal axis. 



8 TECHNICAL VALIDATION OF SHERA 

This section includes the results of several tests of the tech- 
nical aspects of SHERA, to demonstrate that the procedure 
outlined in Sec. [7] works cis intended. 



8.1 Accuracy of interpolation 

While most of the mathematical operations carried out by 
SHERA are simple and easy to carry out to extremely high ac- 
curacy (e.g., a Fourier transform), two of the operations are 
non-trivial because they involve interpolations. As described 
in Sec.[71 we use the IDL cubic interpolation routine both for 
shearing the pseudo-deconvolved images, and for resampling 
the PSF-matched images to the target pixel scale. 

Here we present the results of tests that demonstrate 
that the IDL cubic interpolation routine is sufficiently ac- 
curate for both shearing and resampling. While we carried 
out numerous tests of the pipeline using both real galaxies 
and analytic models, here we focus on tests that use the real 
COSMOS galaxies, under the assumption that they provide 
a more stringent test than analytic galaxy and PSF profiles. 

The first test is of the shearing of the pseudo- 
deconvolved image. In principle, we could carry out this 
test by transforming the pseudo-deconvolved image to real- 
space both before and after shearing, and then comparing 
the observed adaptive moment matrices. Since there is no 
PSF in these images, we can simply check that the mo- 
m ents transform under shea r according to equation (2-13) 
of iBernstein fc JarvisI (|2002h . However, as stated in Sec. [71 
the pseudo-deconvolution leads to ringing in real-space. The 
ringing is relatively high frequency and therefore difficult to 
accurately interpolate, which might lead us to conclude that 
our shearing is not very accurate. However, this ringing does 
not affect the accuracy of the shearing on the final ground- 
based image since convolving with the ground-based PSF 
will eliminate the ringing. Thus, we restrict our tests of the 
pseudo-deconvolved images to the Fourier-space images used 
for the actual shearing, and c heck that the Fourie r counter- 
part of equation (2-13) from [Bernstein fc JarvisI (|2002f) is 
satisfied. 

For this test, we use a random 5 per cent of the COS- 
MOS images, and apply a shear (71,72) ~ (0.02,0) before 
matching to SDSS. We do not add noise to the images, in or- 
der to allow for the highest possible precision in these tests. 
The motivation behind applying zero shear in one compo- 
nent is that it allows us to test that shearing one component 
does not lead somehow to spurious shearing in the compo- 
nent that we do not intend to shear. For each galaxy, we 
compute the adaptive moments of the pseudo-deconvolved 
Fourier space image before shearing (ei, 62) and after shear- 



ing {e\ 



{7) „{7)^ 



We then compare the latter with the ex- 



pected ellipticities {e^ 
served shear Ayi = e. 



(7) 



0(7) 



exp5 '^2,cxp,' 5 



to get the error in the ob- 
el '' for i = 1,2. We can use the 
values for this random subsample of the COSMOS galaxies 



(7) 



„(7) 
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Figure 4. Example of how the SHERA processing changes the galaxy images. Top left: The original galaxy image in COSMOS, on a 
linear scale. Top right: The appearance of the galaxy in SDSS after PSF-matching, before adding levels of sky noise consistent with 
SDSS. Bottom right: Same as bottom left, but with a significant shear in the horizontal direction, 71 = 0.1. Bottom left: Difference image 
between the sheared and unsheared simulated image, normalized by the unsheared image. Because the differences are at most a few tens 
of per cent, they are difficult to pick out visually by comparing the top and bottom right images. 



to study the distribution of A71/71 and A72. We find that 
this distribution is mildly non-Gaussian (with positive kur- 
tosis), and has a median value of A71/71 — 1.6 x 10^"" and 
A72 = 1.5 X 10~^. This result suggests that on average, when 
simulating a sample of > 1000 galaxies, the simulated shears 
are equal to the requested ones to extremely high accuracy. 
Moreover, the act of shearing one component does not lead 
to any significant spurious shear in the other component. 

However, we should also consider the width of the dis- 
tributions of A71/71 and A72. If the distribution is broad, 
then when simulating a few individual galaxies, there could 
be some systematic deviation from the desired shear value 
which does not average out as it would when simulating 
many galaxies. The ensemble 68 per cent confidence inter- 
vals are —0.0015 < A71/71 < 0.0032 (the median is not at 
the centre of this range because it is a skewed distribution) 
and I A72I < 3.5 x 10~* (this distribution is not skewed, pre- 



sumably because no shear was actually applied). We there- 
fore conclude that for any individual simulated galaxy, (a) 
when applying a shear, there is a 68 per cent chance that 
the actual applied shear will be within [—0.15, 0.32] per cent 
of the desired shear, and (b) shear components to which we 
do not intentionally apply a shear remain unsheared at the 
level of a few xlO~*. Thus, the interpolation is sufficiently 
accurate to precisely shear the galaxies on average (that is, 
that there is no systematic problem with the applied shears) 
and even for single galaxies, the applied shears are correct 
at the level of a few tenths of a per cent. 

The other operation for which we must use interpola- 
tion is the image resampling. There may be a concern that 
the tests described above are not an adequate test of the in- 
terpolation for resampling, for the following reason: when we 
apply a (typically small) shear, the pixel grid is not highly 
distorted near the centre of the galaxy. This means that the 
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interpolation is being used to estimate values very close to 
being on the pixel grid, which should not be too difficult. 
However, when we resample to some arbitrary pixel grid, we 
might end up interpolating (even near the image centre) to 
some locations that are not close to lying on the pixel grid. 
We therefore require a test of the interpolation that samples 
the image in a more general way than the above test. 

Here we present a test of the images after pseudo- 
deconvolution, shearing, PSF-matching, and returning to 
real space - in other words, the actual point in SHERA where 
resampling takes place. The test is that instead of resam- 
pling the images, we apply a random rotation, and compare 
the change in the moments with the expected change. We 
know exactly how the galaxy ellipticities should transform: 



ei,rot = [cos(26'rand)]ei + [sin(26lrand)]e2 
e2,rot = -[sin(26'rand)]ei + [cos(26',.and)]e2. 



(37) 



An additional test is to ensure that the area implied by the 
adaptive moments {{M^xMyy — Mf^)^") is unchanged by 
rotation. 

When we carry out this test, we find that there is a non- 
zero but extremely small systematic change in the elliptic- 
ities (compared to the expected ellipticities after rotation): 
typically —1.3 and -1-2.6 x 10~^ for ei and 62, respectively. 
We also find a tiny but statistically significant change in 
the areas implied by the adaptive moments, at the level of 
—8 X 10~^. These numbers are sufficiently small that they 
are truly subdominant to other systematics issues that arise 
in any realistic lensing analysis. 



8.2 Simulated galaxies compared to real data 

Another important test for SHERA is to compare the simu- 
lated SDSS images of COSMOS galaxies with the real SDSS 
images of those galaxies. Here, we will describe several tests. 



8.2.1 Galaxy numbers 

As described previously, the sample of galaxies for which we 
have generated COSMOS postage stamps contains 26 113 
galaxies (Sec. I4.1|l . This number was reduced to 25 527 
(Sec. I6.2|l when we require that the postage stamps go suc- 
cessfully through our postprocessing (to mask out additional 
objects, etc.) and then to 17 706 when we require that the 
galaxies lie in regions considered photometric in the SDSS 
imaging fSec. lA3|l . 

There is one additional cut that we must impose after 
convolution to match the SDSS PSF. This cut relates to the 
fact that the original postage stamp sizes were estimated 
based on a circularly- averaged characteristic radius, which 
means that for very large and flattened galaxies, some flux 
might go off the edge of the convolved postage stamp in the 
direction of the galaxy major axis. To test for this issue, we 
processed all of the 'noiseless' simulated images, compar- 
ing the flux in (a) the pixel with the maximum flux, versus 
(b) the pixel with the maximum flux when considering only 
those pixels at the very edge of the postage stamp. We then 
eliminated those galaxies for which the latter was > 0.01 
times the former. This left us with a flnal galaxy sample 
for all tests of 17 667 galaxies (some of them too faint to 
measure in the SDSS images). 



8.2.2 Comparison with real data 

The flrst comparison we make is between the simulated im- 
ages (without shear or 90 degree rotation) and the actual 
SDSS images, for those that are detected. In principle, the 
images should be the same except for noise and the centroid- 
ing of each object within the central pixel (which we have 
made no attempt to match). 

We begin by comparing the results for the noiseless sim- 
ulations against those with the original SDSS data. In this 
case, there are 9469 galaxies that have measurable galaxy 
shapes (with re-Gaussianization) both in the real data and 
in the simulations; of those, 6361 pass our resolution cuts 
in both cases. We restrict the comparison of shapes to that 
sample of 6361, which should be fairly similar to the source 
catalogue except without a cut on magnitude, which would 
remove another ~ 30 per cent of the galaxies. The results are 
shown in Fig. [51 which shows that for both the observed el- 
lipticities (ei, 62), and for the PSF-corrected (ei^corr, £2, con), 
the median trend is for the simulated values to be equal to 
the ones in the real data (modulo noise, which tends to cause 
scatter in the vertical direction, since it is present in the real 
data but not in these simulations) . This finding suggests that 
the simulation pipeline is indeed providing realistic data. 

The apparent exception to the close comparison be- 
tween simulated and real data is the R2 comparison, which 
shows a distinct trend towards better resolution in the real 
data than in the simulations. There are two causes for this 
finding: the first is that we are imposing the R2 cuts in dif- 
ferent ways in the simulated and real data, since the former 
lacks noise. This results in a form of Malmquist bias, given 
that we impose the R2 cut on the 'true' resolution in the sim- 
ulation but on the noisy R2 in the real data, which means 
that at the low resolution end of the sample, there is an 
induced bias when comparing the noisy versus the noiseless 
results. Indeed, this bias essentially vanishes if we make the 
same plot using the R2 for the simulations with added noise. 

However, even ignoring the very low resolution end, we 
can see that there is a weak (few per cent) tendency for 
galaxies to be scattered preferentially towards the upper left 
part of the plot. Detailed examination of some galaxies in 
that part of the figure suggests that while some are noise 
fluctuations in the real data, others are due to deblending 
failure. There are multiple nearby objects in reality, which 
are resolved in COSMOS, so that the additional objects be- 
sides the central galaxy were masked in our processing of the 
postage stamps; but the blend was not resolved in SDSS and 
hence was treated as one larger object. A careful study of 
these simulations can therefore be used to study deblending 
failures in ground-based data. 



8.2.3 Comparison between original and rotated 

As an additional sanity check, we also show a comparison 
between the shapes for the original and the rotated images, 
in the case of no added shear, but with added noise. For 
the sake of simplicity, we show only one shear component; 
results for the other are comparable. Here, we rely on the 
fact that 

• the intrinsic shapes should be the opposite of each 
other, i.e. eorig,int = — Crot.int; and 
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Figure 5. Basic shape comparison for real SDSS data versus (noiseless) simulated data, with points for each simulated galaxy and trend 
lines indicating the median (solid) and 68 per cent CL (dashed). For comparison, the 1:1 line is shown in all cases. Top left: Observed 
ei shape component (along the pixel direction) of the PSF-convolved galaxy image in the simulation without added noise versus in the 
real data. Top right: Same as left, but after PSF correction. Middle row: Same as top row, but for the 62 ellipticity component. Bottom 
row: Galaxy resolution compared to the PSF in the simulation versus in the real data. 
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• any systematic additive component to the shapes from 
the PSF should be the same, i.e. Corig.sys ~ erot.sys. 

As a rule these additive systematic components will not can- 
cel out over all the galaxies, because of the tendency for there 
to be a coherent PSF ellipticity in any given field that re- 
sults in the systematic components of the galaxy ellipticities 
having the same sign. 

Thus, when we plot Corig versus — Crot, we should find 
that before PSF correction, the results are offset from the 
one-to-one line (but are parallel to it), and the results are 
returned to the one-to-one line after PSF correction. These 
results are shown in Fig. [51 and are entirely consistent with 
our expectations. 



8.3 The need for pseudo-deconvolution 

Before using our simulations to test the impact of pseudo- 
deconvolution (removal of the ACS PSF) , we first consider 
the basic reasons why it may be important to include in a 
simulation pipeline that is meant to be able to accurately 
simulate ground-based data with a wide range of observing 
conditions. 

Though we have focused on the ACS in this paper, the 
importance of pseudo-deconvolution is likely generic to all 
space telescope data (including other HST cameras) since 
it is a feature of diffraction patterns. While the core of the 
PSF has a radius of ~ Sd = \/ D = 0.07" for ACS/F814VK 
(where A is the wavelength of observation and D is the outer 
diameter of the telescope), the diffraction rings contain a sig- 
nificant amount of power. For example, an Airy disc scat- 
ters a fraction ~ 2-k^^9y)/9 of the light to radii > 9 (for 
9/9y) ^ 1). Thus for a galaxy with scale radius ^ fe, the 
effect of the PSF on observed galaxy properties scales in 
proportion to ^d rather than 9^ as one would expect based 
on Gaussian approximations or second moments. An equiv- 
alent effect can be seen in Fourier space: G(fc) for an Airy 
disc has the leading behavior l — 2it~ fc|fe|-|-. . . rather than 
having the first nontrivial term be k^ . Thus diffraction even 
by a large aperture (small Sd) has an effect even for long- 
wavelength features in the image. 

We can see one manifestation of this effect in Fig. [21 
which shows the azimuthally-averaged Fourier space repre- 
sentation of the ACS PSF. While the ACS PSF is always 
above the SDSS PSF in that plot, indicating that the ACS 
PSF preserves more information than the SDSS PSF at all 
values of wavenumber k, it is nonetheless the case that it is 
tens of per cent below 1 at the band limit of the SDSS PSF, 
primarily because of the large-scale impact of the diffraction 
rings. 

The above argument provides the justification for in- 
cluding pseudo-deconvolution as part of SHERA, given our 
intention that it should be useful for simulating ground- 
based imaging data under a wide range of observing con- 
ditions. However, it is not immediately obvious, for the case 
of data with typical seeing of ~ 1.2" such as SDSS, that the 
pseudo-deconvolution is necessary for accurate image simu- 
lations. Here, we address this question using simulations for 
which the pseudo-deconvolution was not performed. That 
is, instead of removing the ACS PSF on scales that we will 
want to use from the ground, shearing, and then finding a 
matching kernel to the ground-based PSF, we simply shear 
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Figure 6. Shape comparisons for the galaxy images in their orig- 
inal orientation, and with a 90 degree rotation (using the same, 
non-rotated PSF). Top: Histogram of PSF ellipticities in the COS- 
MOS field, which shows a coherent tendency towards ei < 
and 62 < 0. Middle: Observed 62 shape of the PSF-convolved 
galaxy image in the original and rotated simulation images. Bot- 
tom: Same as middle, but after correction for the effects of the 
PSF. 
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and then convolve the data directly with the SDSS PSF, ig- 
noring the ACS PSF en tirely. This procedur e was used for 
the STEP2 simulations (iMassev et al.ll2007al '). 

The results of comparing these simulations without 
pseudo-deconvolution, to the regular SHERA simulations 
without added noise, are shown in Fig.[71 Statistical analysis 
reveals the same trends in the data with noise, however they 
are less visually apparent. 

As shown in the top panel of Fig. [T] when we compare 
individual galaxy ellipticities in the SHERA simulations and 
those simulations without pseudo-deconvolution, the latter 
tend to be rounder. The effect is, unsurprisingly, worse at 
large ellipticity. In the middle panel, we can see that the 
observed sizes are also affected: the galaxies appear to be 
larger compared to the SDSS PSF than they do in the sim- 
ulations that account for the ACS PSF. As expected, the 
trend is more important for less well-resolved galaxies. Fi- 
nally, we can see on the bottom that there is a concrete effect 
on the histogram of total ellipticities, with the trend sug- 
gested in the top panel that the simulations without pseudo- 
deconvolution yield a rounder galaxy population overall. 

As a consequence of the overall rounder galaxy popula- 
tion, the inferred galaxy RMS ellipticity erms is ~ 0.31, in 
contrast to the finding from simulations including pseudo- 
deconvolution (Fig. I10[) that it is ~ 0.36. Moreover, we 
find that the inferred shear changes slightly, becoming less 
negative by 1 per cent. Given the evidence that the PSF- 
matching is important in determining the observed galaxy 
properties at the ~ 5-10 per cent level, and Sec. l9.2l showed 
that the shear calibration for re-Gaussianization depends on 
galaxy properties, the different derived shear calibration is 
likely due to the fact that not deconvolving the ACS PSF 
is equivalent to simulating a slightly larger, rounder galaxy 
population. This fact does not invalidate the utility of the 
STEP2 simulations for basic testing of PSF-correction algo- 
rithms; it simply means that to constrain shear calibrations 
in some hypothetical ground-based dataset to better than 
1 per cent for this particular galaxy population, one should 
use simulations that (a) more closely mimic the imaging con- 
ditions at that particular telescope, (b) include other steps 
in the image processing, such as the need to combine multi- 
ple exposures, and (c) include pseudo-deconvolution to more 
faithfully represent the intrinsic properties of the galaxy 
sample. 

The effects that are described in this section will be 
more important for several limiting cases: (1) ground-based 
images with better resolving power than SDSS, such as Sub- 
aru Suprime-Cam (for which the median seeing is 0.7" and 
the pixel size 0.2", both numbers a factor of ~ 2 smaller than 
for SDSS) and (2) ground-based images that are deeper, in- 
cluding more galaxies that are intrinsically small and faint. 
We have explicitly tested the first scenario, using a typical 
Subaru Suprime-Cam PSF, and found that the effects of 
ignoring pseudo-deconvolution on the observed galaxy sizes 
are nearly twice as large as for SDSS. Thus, for all upcoming 
surveys, image simulations that rely on space-based data to 
precisely calibrate shears must account for the PSF if they 
want to simulate a realistic galaxy population using SHERA. 
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Figure 7. Comparison of SHERA simulations versus those that 
do not include the pseudo-deconvolution of the ACS PSF. Top: 
Scatter plot comparing the values of one ellipticity component, 
ei , for each galaxy. The solid line shows the 1 : 1 line; the dashed 
line shows the best-fit line, refiecting a factor of 0.9 multiplicative 
offset. Middle: Scatter plot comparing the values of resolution 
factor, R2, for each galaxy. The dashed best-fit line reflects a 
~ 15 per cent offset for the poorly-resolved galaxies. Bottom: 
Histogram of etot = (cj + e^)^'^ values in the two simulation 
sets. 



© 0000 RAS, MNRAS 000, 000-000 



18 Mandelhaum et al. 



Impact of using Tiny Tim rather than 
observed PSFs 



8.4 



As pointed out in Sec. 16.31 there are some known issues 
with the Tiny Tim PSF s th at are used for PSF co rrection in 
iLeauthaud et al.l (J2007l ) and lMassev et al.l l|2007ah . and that 
are used in this work and included with the associated data 
release. Consequently, we must estimate the impact of using 
them rather than real stars (from dense stellar fields, with 
the same primary/secondary separation) when performing 
the pseudo-deconvolution step before matching to the target 
ground-based PSF. 

For this test, we used a random subsample of COSMOS 
galaxies, and compared the observed resolution factors (-R2) 
with respect to the SDSS PSF when we used SHERA with the 
same input parameters, only varying which COSMOS PSF 
was used for the pseudo-deconvolution. We found that for 
well-resolved galaxies, the R2 value was 0.4 per cent larger 
when using the real PSF stars than using the Tiny Tim mod- 
els; at our lower resolution limit, they were typically 1.2 per 
cent larger. For context, we saw in Fig. [7] that at the res- 
olution limit, if we did not pseudo-deconvolve but instead 
ignored the ACS PSF entirely, the resolution factors in the 
simulated data differed by 16 per cent. Thus, crudely speak- 
ing, pseudo-deconvolution using the Tiny Tim PSF models 
effectively removes > 93 per cent of the impact of the ACS 
PSF in the final simulated images. 

The practical barrier at this time to simply using the 
stellar images for pseudo-deconvolution is that the stellar 
fields are typically not deep enough to get a very high S/N 
PSF estimate on a per-star basis; the first diffraction ring is 
barely, if at all, visible when looking at a single star. Thus, a 
reliable PSF interpolation routine would be necessary to fit 
for a high S/N PSF model as a function of CCD position, 
which includes a non-negligible amount of development and 
testing to validate it. While such development is ultimately 
necessary for very high precision tests using SHERA, we defer 
it to future work. 



8.5 Impact of noise in original COSMOS images 

The noise in the original COSMOS galaxy postage stamps is 
sheared and convolved with the target PSF. In this section, 
we concretely demonstrate the impact of that low noise level 
on the simulated images. 

To carry out this test, we take a subset of the simulated 
SDSS images, and do the following operations: 

• We start with the CTI-corrected and cleaned galaxy 
postage stamps from Sec.|6] but in addition to masking out 
all additional objects in the postage stamp, we also mask 
out the central object. This gives us a standard galaxy-size 
postage stamp but with only a correlated noise field, no real 
objects. 

• We process it with a modified version of SHERA using 
the same target PSF as when simulating the real SDSS data, 
including forcing the code to use the same normalizing fac- 
tors for the flux as when making the simulations that do have 
the galaxy present. This ensures that the normalization of 
the resulting sheared, PSF-convolved noise field is the same 
as that in the standard galaxy simulations. To make it eas- 
ier to detect systematic effects, we impose a relatively large 
shear, 7input = 0.1. 



• We then add the resulting sheared, correlated noise 
fields to the (shear-free) galaxy simulations used in Sec. 18.21 
without and with noise added to match SDSS. 

We compare the original noiseless simulations versus 
those that have the new correlated noise fields added, 
constructing Aei, and Ae2 (for the PSF-corrected galaxy 
shapes) . This comparison reveals systematic offsets in prop- 
erties due to the correlated noise fields that result from run- 
ning a COSMOS noise fleld through SHERA. The results, 
with different input shear values (7input) for the noise fields, 
are consistent with 



A7 

'Yinput 



0.01. 



(38) 



As a practical consequence, this means that the original 
noise in the COSMOS images, when sheared, does not affect 
our ability to test shear recovery to the per cent level. How- 
ever, if we wish to constrain shear calibration to well below 
the per cent level, then this effect (in addition to the effects 
in Sec. 18. 4|) must be accounted for. We defer consideration 
of this issue to future work. 



9 RESULTS: SHEAR CALIBRATION 

In this section, we present one example usage of the SHERA 
pipeline to assess the calibration of lensing shear measure- 
ments using the re-Gaussianization PSF correction method. 
For these tests, we chose 14 sets of (71,72) shears to 
simulate; these are shown in the top panel of Fig. [S] For 
each of the 17 706 galaxies used for the simulations to val- 
idate the pipeline fSec. 18.2]) . we used SHERA to generate 56 
additional simulated galaxies: 14 shear sets, 2 noise options 
(noiseless and noisy), and 2 orientations (original and 90 de- 
gree rotated). We then select galaxies in various ways (to be 
described shortly), and defined a weight function for each 
galaxy: 



WCOSMOS.i 



(39) 



Here the numerator wcosmos.i is the inverse of the postage 
stamp selection function (Sec. 14.11 and Fig. [1} meant to re- 
move our selection bias against physically large galaxies. The 
denominator only is significant for shear estimates using the 
simulations with sky noise, since the shape measurement 
error is negligible for those without added noise. For typ- 
ical galaxy-galaxy lensing analyses, there is an additional 
factor in this weight function: E^^, which corresponds to 
optimal weighting in the case that lens and source redshifts 
are both known. For the simulations, we cannot easily in- 
clude such a factor, because simulating the SDSS photo-z 
would require simulating ugriz data and processing it with 
the SDSS Photo pipeline. 

To estimate the shear, we then defined 



7q 



4Ssh l]i Wi 



(40) 



in terms of the PSF-corrected shapes, for shear components 
a — 1,2 and galax;ies i. The Ssh factor, or shear responsivity, 
represents the response of our particular ellipticity definition 
Eq. ((9| to a shear; it is equal to 1 — Brms- 
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9.1 Without sky noise 

We first present results using tlie 'noiseless' simulations, 
which we expect to do with very small statistical errors 
since there is essentially no measurement error, and the use 
of original and rotated shapes should effectively eliminate 
the shape noise. In order to select an approximately reason- 
able galaxy population for this test, we use the COSMOS 
F814:W magnitudes and the typical SDSS colours to impose 
an approximation of the r < 21.8 cut (corresponding to that 
for the real source catalogue). We also require a resolution 
factor R2 > 1/3 for both the original and rotated galaxy; 
this results in a sample of 6 160 galaxies. The comparison 
between the estimated shears (71,72) and the true ones for 
each of the 14 simulations with shear is shown in Fig. [S] 

As shown, there is a clear detection of non-zero calibra- 
tion bias and additive PSF systematics. When fitting to 



7a 



■ya = rUa 7a + Cq 



(41) 



both the slope (calibration bias) m and the additive constant 
c differ from zero. The calibration biases are — 1.6 ± 0.1 and 
— 2.7± 0.1 per cent for the two shear components. The fact 
that these biases differ for the two components, and that the 
latter is worse than the former, is consistent with results of 



iHigh et all (|2007D and iMassev et al.l (|2007al ). The standard 
explanation is that the pixel resolution is effectively a factor 
of y/2 worse along the diagonals of the pixels than along the 
pixel direction. If we remove the weighting in Eq. H39p . then 
the calibration biases change by 0.1 per cent, the size of the 
Icr error. The sign and magnitude of this change can be ex- 
plained by the weak but non- negligible correlation between 
the galaxy weights iocosmos (to account for the inability to 
create postage stamps for some of the larger galaxies) and 
the galaxy size, given the trends we will see in the calibration 
bias with galaxy size in Sec. 19.21 

The nonzero additive contamination (c values) can be 
explained by the nonzero average PSF ellipticity, which 
is imperfectly removed from the galaxy images by re- 
Gaussianization. For context, the typical PSF ellipticity 
in SDSS is ~ 0.05, so the fractional contamination is 
|ci/ei,PSFi ~5 X 10"^ 



9.2 Dependence on sample properties (noiseless) 

In figure 5 of iMandelbaum et al.l (|2005l ). predictions 
were shown for the shear calibration bias for the re- 
Gaussianization method using noiseless simulations, for ex- 
ponential and de Vaucouleurs galaxies, using a Kolmogorov 
turbulence-induced profile, i.e. InG'(fe) ex —k^'^. As shown 
there, the calibration biases depend on the galaxy profile, 
being more negative for de Vaucouleurs profiles than for ex- 
ponentials; on the resolution factor, being more negative at 
intermediate resolutions {R2 ~ 0.6) and closer to zero for 
resolutions at the lower (1/3) and upper (1) limits; and on 
the intrinsic ellipticity, being more negative for more intrin- 
sically circular galaxies. We now test all of these predictions, 
again in the case without measurement noise. 

To carry out these tests, rather than computing an en- 
semble 7 to compare with the input value via weighted sum- 
mation over all the individual shear values, as in Eq. (|40p . we 
instead consider individual shear estimates for every single 
galaxy. To estimate a shear for each galaxy, we use the orig- 
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Figure 8. Results of shear calibration tests for the re- 
Gaussianization PSF-correction technique, using the 'noiseless' 
COSMOS simulations with flux and resolution cuts. Top: The 
true input shear values, and the estimated ones. Middle: The er- 
ror in recovered shear component 1, 71—71, as a function of the 
input shear. The best-fitting line is also plotted. Bottom: Same 
as middle, but for the other shear component. 
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inal and rotated shape measurements for each component 
(4 meas urements), and write four equations based on eq. 
(2-13) of iBernstein &: Jarvid (|2002l ) to express those observ- 
ables in terms of the intrinsic shape and the applied shear 
(4 unknowns). These equations are nonlinear functions of 
the intrinsic shape and applied shear; we solve this nonlin- 
ear system of equations using the IDL implementation of 
Broyden's method to estimate 7a, i (for shear component a, 
galaxy i). 

Given the individual shear estimates, we can compute 
the fractional error in each one, (7a, i — 7q)/7q in bins in 
galaxy properties. The properties used for this test are the 
resolution factor R2; the total ellipticity etot = yef + ef; 
and the Sersic rig deriv ed from single comp onent fits to the 
COSMOS images fromlSargent et al.1 (|2007l ). using GIM2D 
l|Marleau fc Simardlll998l ). The results are shown in Fig. H 

As shown in the top panel, we find a calibration bias 
that approaches zero for the lowest and highest resolutions 
in our sample, and goes as low as —4 per cent for the galaxies 
at i?2 ~ 0.7. This finding is qualitatively similar to tha t from 
the noiseless simulations of iMandelbaum et al.l (|2005l ). with 
the difference being that these results intrinsically average 
over a more realistic galaxy population (all morphologies 
rather than single-component Sersic profiles with Us = 1 
and 4) and real PSFs. 

The middle panel of Fig. [5] shows that the calibration 
bias tends to be most negative (~ —3 per cent) for galaxies 
that are nearly circular, and increases to for |e| ~ 0.6, be- 
coming slightly positive at even higher ellipticities (where, 
however, there are very few galaxies and therefore the sta- 
tistical significance of the trend above | e| > 0.6 is unclear). 
This t rend is again similar to that from IMandelbaum et al.l 
l|2005h . 

The bottom panel of Fig. |9] shows the trend with 
Sersic n^, which gives a calibration bias that is most nega- 
tive for low Us (exponential galaxies) and is closer to zero 
for higher Us- Thi s find ing is the opposite of that from 
IMandelbaum et al.l (|2005l ): however, it is important to note 
that in this case the Us values are not exact represen- 
tations of the real galaxy morphologies, given that most 
galaxies show some deviation from a perfect elliptical Sersic 
profile (either having more small-scale structure, or being 
clearly composed of multiple components such as a bulge 
and a disk). Thus, it is unclear that the results shown here 
as a functi on of ris can truly be di rectly compared with 
those from IMandelbaum et al.l (120051 ). without first assess- 
ing which of the COSMOS galaxies are in fact consistent 
with a featureless Sersic model. Moreover, the statistical sig- 
nificance of the observed trend is fairly weak. 



Figure 9. Results of sliear calibration tests for the re- 
Gaussianization PSF-correction technique, using the 'noiseless' 
COSMOS simulations with flux and resolution cuts. Top: The 
fractional error in the recovered shear as a function of the galaxy 
resolution with respect to the PSF, R2, along with the histogram 
of i?2 values for the noiseless simulations. The points and solid 
line show the median trend; the dashed lines show the statistical 
uncertainty in the trend-line. Middle: Same as top, but as a func- 
tion of total ellipticity |e|. Bottom: Same as top, but as a function 
of Sersic Us as determined from the COSMOS images. 



9.3 Noisy simulations 

We now consider the shear calibration bias in simulations 
with sky noise. However, because the sky level in the SDSS 
imaging of the COSMOS field is atypically high (Sec ETj) . 
we use a set of simulations that are otherwise identical to 
the ones from Sec. 19.11 but with sky noise that is 15 per 
cent lower (in the standard deviation). The more typical 
noise level in these simulations makes them more like typical 
SDSS data. 

There is an important subtlety when using a sample 
that has noise to estimate shear calibration bias: we must 
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be very careful when selecting galaxies to use for the shear 
estimation. In Sec. 19. 11 we simply approximated an r < 21.8 
cut in order to get a roughly similar galaxy population as 
in the real data. Here, however, we know that galaxies that 
have a noise fluctuation such that they are harder to detect 
would realistically not be included in our sample, and that 
is a good thing given that their shears should be unreason- 
ably difficult to measure. In order to perfectly mimic our 
sample selection in the real data, we would have to process 
the simulated images in the exact same way as the real data, 
using Photo. However, there are other practical obstacles 
to constraining the shear calibration in all of SDSS using 
this simulation set (e.g., the fact that we have not sought 
to carefully mimic the distribution of observing conditions 
throughout the entire SDSS area). So, we instead use a sim- 
pler approximation of our selection in the real data, as a 
basic demonstration of the power of the SHERA code rather 
than as a quantitative estimate of the SDSS shear calibra- 
tion bias. 

Our crude selection relies on the estimated shear mea- 
surement errors a-, for each galaxy. The re-Gaussianization 
code estimates a-y (per component) by taking the in- 
put value of sky varianc e a^^^y from Eq. (|A2[) . and using 
iBernstein fc Jarvis|[20o3 ) 



2 



(42) 



U S/N ' 

where /„, is the weighted flux (by the adaptive moment ma- 
trix). This equation is only approximate, and assumes Gaus- 
sianity of the PSF-convolved galaxy image. 

Thus, our approach to object selection in these simula- 
tions is as follows: 

(i) We examine the real SDSS source catalogue to find 
what S/N is implied by the a-y values at our limiting mag- 
nitude r = 21.8. While there is a range of cr^ values at fixed 
magnitude, we find that r < 21.8 corresponds to a-, < 0.21 
(or S/N > 9.5). 

(ii) We impose that cr-y cut on the original and rotated 
simulated image^fj. In practice, for a given set of noisy sim- 
ulations, typically 4100 galaxies pass this cut, the resolution 
cut, and the etot < 2 cut. 

The first aspect of the shear estimation that we can test 
using the noisy simulations is the shear responsivity calcu- 
lation, which is in the denominator of our shear estimator 
Eq. (gOJ and is 



Ssh 



1 



(43) 



where the rms ellipticity per component is ideally defined as 
a weighted sum over the ellipticities of the source population, 



Y:^■wi 



given true, noiseless, ei and 62 values (the mean ellipticity 

^^ Technically, in the real data, we impose our S/N cut in r. How- 
ever, there is also a magnitude cut in i which, given the relation 
between the sky variances in the two bands and the typical galaxy 
colours, corresponds to a similar S/N cut in that band. Thus, re- 
quiring the magnitude and resolution cuts in r and i in the real 
data is parallel to our imposition of S/N and resolution cuts in 
the simulations for both the original and the rotated images. 
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Figure 10. The RMS ellipticity erms as a function of the ap- 
parent magnitude. Results are shown for the simulated data with 
noise, before (long-dashed) and after (solid) subtracting off our 
estimates of the shape measurement error due to photometric 
noise. The results are also shown for the same galaxies using the 
ellipticities from the noiseless simulations (dashed line). 



per component, (ei) = (62) = 0). In real data, we lack a 
noiseless estimator of ei and 62, so we must estimate Crms 
using our noisy estimated ei and 62: 



^2 

C-VJT- 



2ai 



E.w'' 



(45) 



Errors in the estimated erms from Eq. (|45p propagate into 
the shear estimates via Eq. (|40p . 

For our noiseless simulations, we come as close as possi- 
ble to being able to carry out a 'true' erms estimate, Eq. (|44p . 
In practice, these simulations do include a low level of noise 
from the COSMOS simulations, but we at least know that 
this will tend to increase the estimated emis; thus, the erms 
from the noiseless simulations serves as an upper limit on 
the true erms. Hence, we first estimate erms as a function 
of magnitude for the noiseless and the noisy simulations, to 
estimate how much the inaccurate a^ estimates might be 
biasing erms, Ssh, and the estimated shears. The results of 
this estimate are shown in Fig. [TD] 

As shown, the RMS ellipticity is close to flat with mag- 
nitude in the noiseless simulations [fj In contrast, in the noisy 
simulations, it appears to increase significantly at fainter 
magnitudes, even after our attempts to subtract off the mea- 
surement error. This result implies that our shape measure- 
ment errors are underestimated. Fig. \TU\ gives us a way to 
assess how much the estimated shear responsivities are in- 
correct due to our underestimated shape measurement error. 

^^ In an upcoming paper, Reyes et al. (2011, in prep.), we will 
present evidence that the deviations from flatness in these simula- 
tions are due to small levels of nonlinearity in the PSF correction 
that lead to non-Gaussian error distributions. 
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We find that for the noisy simulations, the responsivity is 
estimated as 0.84, whereas in the noiseless ones, it is 0.86. 
Thus, our responsivity is 2.5 per cent too low, so the shears 
are overestimated by this amount in the noisy simulations. 
In the text that follows, when we quote shear calibration 
biases for noisy simulations, they include this shear overes- 
timation due to shear responsivity error. 

The value of erma ~ 0.36 in Fig. [10] may seem in- 
consistent with that from the COSMOS da ta for bright 
galaxies, figure 17 of iLeauthaud et al.l (|2007|), which gives 
firms ~ 0.27. However, as described in Sec. 14.31 the shape es- 
timates in the se two works differ in use of circularly weighted 
(|Leauthaud e t al. 2007) vs. elliptical-weighted adaptive mo- 
ments (this work), such that we should find a larger erms in 
this work. While Eq. (|13p relating the two ellipticity mea- 
surements is only exactly valid in certain unrealistic limits 
(Gaussian profiles), we can nonetheless use it to estimate 
the effect of this different shape definition. If our simulation 
ellipticities are transformed using Eq. H13|) to those that are 
expected using RRG , then the resulti n g erm s = 0.28, quite 
similar to that from ILeauthaud et al.l ( 20071 ) . Residual dif- 
ference may be due to the cases where Eq. p3|) fails to relate 
the two ellipticities correctly. 

Next, we carry out a similar shear calibration bias cal- 
culation as in Sec. 19.11 but with the noisy simulations, using 
the G^ cut discussed earlier in this section to remove those 
galaxies that have low significance detections. We use two 
independent noise realizations of each galaxy pair to reduce 
the noise. In comparison with the noiseless case shown in 
Fig. [HI for which we had found ensemble calibration biases 
of mi = — 1.6±0.1 and m2 — — 2.7±0.1, the noisy case gives 
calibration biases of mi — —3.8 and 7712 — —4.3 ± 2.5 per 
cenllfj. This result is just 2 per cent worse than the noise- 
less case; however, the net —4 per cent calibration bias also 
includes a +2 per cent bias due to incorrect shear responsiv- 
ity, implying a —6 per cent bias due to insufficient dilution 
correction and noise rectification bias. 

We can compare these results with nois y simulations 
agains t those from the STEP2 simulations i Massev et al.l 
[2007a|). There are a number of reasons why we do not ex- 
pect the results to agree: for example, the fact that a deeper 
sample population was being simulated in STEP2, with in- 
trinsically different properties; the fact that higher resolu- 
tion Subaru PSFs and pixel scale were used in STEP2; and 
the fact that the ACS PSF was not pseudo-deconvolved be- 
fore convolving with the ground-based PSF. Nonetheless, 
we compare against the following STEP2 results: the shear 
calibration bias for re-Gaussianization (denoted 'RM' in 
JMassev et al.ll2007al ') was typically —2.5 per cent; and the 
shear calibration bias is more negative for fainter magni- 
tudes. Our — 4.0±2.5 per cent calibration bias is statistically 
consistent with the STEP2 results. Moreover, the fact that 
the calibration bias became more negative as we moved from 
noiseless to noisy simulations is qualitatively consistent with 
the STEP2 result that calibration bias is more negative at 



^^ When calculating unweighted sums over the galaxies, without 
the weight factor in Eq. II39I I. the calibration biases worsen by —2 
per cent, just under Icr. This difference reflects the fact that the 
bias is worse when we include galaxies near the flux limit, which 
get downweighted due to their larger Ce. 



fainter magnitudes (i.e., lower S/N). A detailed quantita- 
tive comparison is beyond the scope of this paper due to the 
many intrinsic differences between these simulations. 



9.4 Limitations of these results 

The shear calibration bias estimates in this section were in- 
tended primarily as a demonstration of one possible applica- 
tion (out of many) of the SHERA pipeline. Here we summa- 
rize why the specific numbers presented here should not be 
used as a precise estimate of the shear systematics in science 
papers using the SDSS re-Gaussianization shape catalogue: 

(i) In this section, we have averaged over all sources that 
are detected in the simulations. In practice, the source pop- 
ulation that is used depends on the lens redshift, since pho- 
tometric redshifts are used to (roughly) select those sources 
that are behind the lenses. 

(ii) When calculating signals for the real data, there is 
an additional weight factor E^^ to achieve optimal weight 
in the estimate of AE. This weight correlates with galaxy 
properties and, consequently, will change the mean calibra- 
tion bias for a given sample population. 

(iii) The range of observing conditions in the SDSS is 
quite broad. In practice these results might depend in detail 
on simulating a range of observing conditions. 

(iv) The real data use r and i band averaged shapes. In 
practice these might have different shear calibration biases 
than the i band shapes used here, because of colour gradients 
changing the observed morphology slightly, and because of 
the different S/N of detection in the two bands. 

(v) As demonstrated in Sec. 19.31 the estimate of shear 
calibration depends on how the sample is cut in S/N. Thus, 
our crude a^ cut must be replaced with a more realistic ap- 
proximation of the sample selection used in the SDSS data, 
based on the model fiux from Photo. 

(vi) Using a realistic simulation, we should be able to un- 
derstand the impact of selection biases in determining the 
shear calibration. Because the simulations described here 
have several atypical features (particularly the use of 90 de- 
gree rotated galaxy pairs), the selection biases do not oper- 
ate in the same way here as in real data. 

A precise calculation of the shear calibration bias for 
the real data would have to account for these effects. 



10 DISCUSSION 

We have described new, publicly-available software, SHERA, 
that can be used to simulate (optionally sheared) ground- 
based images with realistic morphologies and any PSF that 
has worse resolution compared to COSMOS. This software 
is independent of modeling assumptions about what galax- 
ies or PSFs look like, properly handles the pixel response 
functions, and has been tested to sub-per cent precision 
(Sec. [8)). The code has been publicly released, along with 
CTI-corrected, cleaned COSMOS galaxy postage stamps for 
a flux-limited sample at F814W < 22.5. This code should 
be highly useful for realistically assessing the systematics of 
lensing analysis (or indeed any other detailed image analysis, 
such as galaxy profile determination) in ground-based data, 
including the effects of a range of observing conditions. It 
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should also allow for assessment of parameter uncertainties 
and degeneracies, and selection biases, via the generation 
of many noise realizations for a single galaxy. As a basic 
demonstration, we have shown a crude estimation of shear 
calibration bias for SDSS single-epoch lensing analysis. 

In order for the code and data that are described in this 
paper to be more broadly applicable, there are several im- 
provements that would be needed. First, we would like the 
ability to simulate data in ot her passbands bes ides i (for 
which there is precedent, e.g. iFerrv et al.l 120081 used data 
from the Ultra Deep Field to do so). Realistic galaxies have 
colour gradients that will cause an intrinsically different ap- 
pearance in significantly different bands. While this may be 
unimportant for current surveys that have statistical errors 
that are > 5 per cent, future surveys such as LSST that 
aim for better than 1 per cent precision in the lensing signal 
have correspondingly a need for very well-constrained shear 
systematics. Thus, such effects must be handled realistically, 
and it is possible that sufficient well-sampled data^^l in other 
bands and in random fields exists in the HST archive that 
could be used for this purpose. We specify random fields 
because those that were selected due to, e.g., presence of a 
galaxy cluster, may not have a representative morphological 
mix. There is additional value to obtaining data in other, 
random fields, given that even with the relatively large (for 
HST) size of the CO SMOS field, it still exhibits significant 
cosmic variance (e.g.. lKovac et al.ll201Gf ) in the redshift dis- 
tribution, which may manifest at some level as an atypical 
morphology distribution. 

Additionally, we need a way to simulate a deeper sam- 
ple. Presently we have stopped at / < 22.5 because the 
method described here requires modification if the input 
images have non- negligible noise, and the COSMOS data 
have S/N > 50 for that magnitude range. Thus, we require 
either a generalization of the method to lower S/N in the 
input images, or we require deeper input data than is avail- 
able in the COSMOS field. In some cases, the former would 
be possible: since the SHERA algorithm is a linear opera- 
tion on the COSMOS postage stamps, one could propagate 
any noise covariance matrix through the pipeline and arrive 
at an output covariance matrix Njf on the output postage 
stamps. If the noise covariance in the data we wish to sim- 
ulate is Nij^'^ , then so long as N — N^^ is semi-positive 
definite, one could add in the appropriate additional noise 
and thereby extend the methodology of this paper to noisy 
input data. The software implementation of such a method, 
exploration of its range of applicability, and investigation of 
complications such as non-Gaussian noise is left to a future 
paper. In the opposite case, namely that where the input 
data has more noise in some mode than the data we wish 
to simulate, it seems likely that the problem is hopeless and 
deeper input data would be required. 

Despite the need for future work to make the code 
and/or input dataset as useful as possible for lensing sur- 
veys that are coming up on the timescale of ~ 1 decade, we 
anticipate that SHERA vl.O has numerous applications for 
better understanding of current data and those surveys that 
are starting in the next year, such as KIDS, HSC, and DES. 



^^ Data witii some instruments may not have a Nyquist-sampled 
PSF, depending on the choice of dither pattern. 
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APPENDIX A: SDSS IMAGING PROPERTIES 
USED FOR SIMULATIONS 

In order to mimic SDSS data, we have determined several 
properties of the SDSS data at the position of each galaxy. 
While not all users will want to mimic specifically SDSS 
data, we include this information with the data release as 
well. 



Al SDSS observations of the COSMOS field 

It is worth noting that there are several atypical aspects 
to the SDSS imaging in that region. First, the median see- 
ing is slightly better than typical for the SDSS survey as a 
whole (by ~ 10 per cent) although in fact the range of see- 
ing values is rather broad. Second, the sky level and there- 
fore the photometric noise at fixed magnitude is higher than 
usual for SDSS. As a consequence, the object detection and 
star/galax;y separation are somewhat less efficient than in 
most of th e survey area for r > 21 or i > 20.6 (for more 
details, see lNakaiima et al.ll2011h . 



A2 SDSS PSF 

The SDSS PSF is determined for all galaxies in an SDSS 
field by the psp (postage stamp pip eline) using a proce- 
dure described in Tupton et al.l (|200ll ). In brief, it involves 
modeling the temporally and spatially varying PSF using a 
Karhunen-Loeve (KL) transform, which uses a set of bright 
stars to determine basis functions and then to fit their coor- 
dinates to spatially varying (quadratic) functions. The infor- 
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mation about the basis functions and how their coefficients 
vary across a field is included in psField files. 

Thus, for each galaxy in our COSMOS sample for which 
we have postage stamp images, we find its position in SDSS 
imaging. Some of these galaxies are too faint to be detected; 
however, given their position on the sky we can still deter- 
mine precisely the CCD position at which they would have 
been detected in SDSS. Given this information, we obtain 
a postage stamp image of the galaxy i-band PSF using the 
publicly available read_psf C codqfj that reconstructs the 
basis functions and the variation of the coefficients across 
the field from the SDSS psField files. 



SDSS as being a random, uncorrelated Gaussian noise field 
with variance given by 

sky 



2 
fsky 



gam 



+ 0-da 



(A2) 



where the first term results from the Poisson noise due to the 
photons in the sky, and the second is due to the dark current 
(current that builds up due to heat even in the absence of 
photons). 



A3 Photometricity 

As stated previously, roughly 33 per cent of the SDSS imag- 
ing in the COSMOS field is classified as non-photometric 
according to the ubercalibration (Padmanabhan ct al. 20081 ) 
procedure on the rerun 301 (DR8) reductions. There are four 
SDSS fields overlapping the COSMOS region: 1462 and 1907 
include most of the galaxies, and 1458 and 2125 each cover a 
small fraction of the area. All of 1462 and 2125 in the COS- 
MOS region are classified as photometric, but only part of 
1907 and none of 1458 in that area are photometric. When 
determining the photometric ofi'set between COSMOS and 
SDSS photometry we must be careful to exclude the regions 
that are non-photometric, and the data release includes in- 
formation about photometricity. 

A4 Photometric calibration 

We start with the total galaxy magnitudes from COSMOS 
with an offset to convert from FSUW to SDSS i (Sec- 
tion [57T|. To determine the total number of i-band counts 
to assign to a galaxy of this magnitude, we first determine 
the relevant number of nanomaggies using 



mag = 22.5 — 2.5 logj^g [fiux (nanomaggies)]. 



(Al) 



We then use the SDSS photometric calibration (in units of 
nanomaggies per count) from ubercal in the relevant SDSS 
run, camcol, and field for each position. This will allow us 
to generate images with units of counts per pixel. 



A5 Noise level 

When determining the level of noise to put into the fake 
data, we ignore the noise in the COSMOS observations, 
which is very small relative to that in SDSS (as stated in 
Sec. 15.21 the faintest magnitude that we use in COSMOS 
corresponds to a minimum S/N ~ 50, and most are > 100). 
Accounting for it in detail would be quite challenging given 
that it exhibits non- negligible pixel-to-pixel correlations af- 
ter we convolve it with the SDSS PSF. We have confirmed 
that for these S/N levels, the noise fields that result from 
adding the desired level of uncorrelated Gaussian noise are 
statistically consistent with the noise fields we hope to in- 
troduce; that is, the KS test shows no deviations due to the 
noise in the original COSMOS postage stamps. 

For these simulations, we approximate the noise in 
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